If you’re looking for a hosted desktop recording API, consider checking out Recall.ai, an API that records Zoom, Google Meet, Microsoft Teams, in-person meetings, and more.
With macOS 14.4, Apple introduced new API in CoreAudio that allows any app to capture audio from other apps or the entire system, as long as the user has given the app permission to do so.
Unfortunately, this new API is poorly documented, and the nature of CoreAudio makes it hard to figure out exactly how to set things up so that your app can use this new functionality.
This project is provided as documentation for this new API to help developers of audio apps.
Here’s a brief summary of the new API added in macOS 14.4 and how to put everything together.
As you’d expect, recording audio from other apps or the entire system requires a permission prompt.
The message for this prompt is defined by adding the NSAudioCaptureUsageDescription key to the app’s Info.plist. This key is not listed in the Xcode dropdown, so you have to enter it manually.
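As a rough illustration, the entry in the Info.plist source could look like the fragment below. The key name is the one mentioned above; the description string is only a placeholder and should be replaced with wording that fits your app.

```xml
<!-- Placeholder usage description; the text shown to the user is up to you -->
<key>NSAudioCaptureUsageDescription</key>
<string>This app records audio from other apps so you can save it to a file.</string>
```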
There’s no public API to request audio recording permission or to check if the app has that permission. This project implements permission check/request using private API from the TCC framework, but there is a build-time flag to disable private API usage, in which case the permission will be requested the first time audio recording is started in the app.
Assuming the app has audio recording permission, setting up and recording audio from other apps can be done by performing the following steps:
- Get the PID of the process you wish to capture
- Use kAudioHardwarePropertyTranslatePIDToProcessObject to translate the PID into an AudioObjectID
- Create a CATapDescription for the object ID above, and set (or just get) its uuid property, which will be needed later
- Call AudioHardwareCreateProcessTap with the tap description to create the tap, which gets its own AudioObjectID
- Create a dictionary for your aggregate device that includes [kAudioSubTapUIDKey: <your tap description uuid string>] in its kAudioAggregateDeviceTapListKey (you probably want to configure other things, such as setting kAudioAggregateDeviceIsPrivateKey to true so that it doesn’t show up globally)
- Call AudioHardwareCreateAggregateDevice with the dictionary above
- Read kAudioTapPropertyFormat from the process tap to get its AudioStreamBasicDescription, then create an AVAudioFormat matching the description; this will be needed later
- Create an AVAudioFile for writing with your desired settings
- Call AudioDeviceCreateIOProcIDWithBlock to set up a callback for your aggregate device
- Inside the callback, create an AVAudioPCMBuffer passing in your format; you can use bufferListNoCopy with a nil deallocator, then just call write(from:) on your audio file, passing in the buffer
- Call AudioDeviceStart with the aggregate device and IO proc ID
- Remember to call all your Audio...Stop and Audio...Destroy cleanup functions
- Let the AVAudioFile deinit to close it
- Now you have an audio file with a recording from the system or app
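The steps above can be sketched in Swift roughly as follows. This is a minimal, untested sketch rather than the code used in this project: the `ProcessTapRecorder` class, the `TapError` type, the `check` helper, and the device name are hypothetical names made up for illustration. It assumes that the app already has audio recording permission, that the tap reports a standard PCM format, and that an aggregate device containing only the tap is sufficient for your use case. A real app will likely need more configuration and error handling.

```swift
import AVFoundation
import CoreAudio

// Hypothetical error type for this sketch.
enum TapError: Error {
    case osStatus(OSStatus, step: String)
    case processNotFound
    case invalidFormat
}

// Hypothetical helper: turn a non-zero OSStatus into a thrown error.
private func check(_ status: OSStatus, _ step: String) throws {
    guard status == noErr else { throw TapError.osStatus(status, step: step) }
}

@available(macOS 14.4, *)
final class ProcessTapRecorder {
    private var tapID = AudioObjectID(kAudioObjectUnknown)
    private var deviceID = AudioObjectID(kAudioObjectUnknown)
    private var procID: AudioDeviceIOProcID?
    private var file: AVAudioFile?
    private let queue = DispatchQueue(label: "ProcessTapRecorder.io")

    var isRecording: Bool { procID != nil }

    func start(pid: pid_t, outputURL: URL) throws {
        // Translate the PID into the process's AudioObjectID.
        var address = AudioObjectPropertyAddress(
            mSelector: kAudioHardwarePropertyTranslatePIDToProcessObject,
            mScope: kAudioObjectPropertyScopeGlobal,
            mElement: kAudioObjectPropertyElementMain)
        var qualifier = pid
        var processObject = AudioObjectID(kAudioObjectUnknown)
        var size = UInt32(MemoryLayout<AudioObjectID>.size)
        try check(AudioObjectGetPropertyData(
            AudioObjectID(kAudioObjectSystemObject), &address,
            UInt32(MemoryLayout<pid_t>.size), &qualifier, &size, &processObject), "translate PID")
        guard processObject != AudioObjectID(kAudioObjectUnknown) else { throw TapError.processNotFound }

        // Describe and create the process tap; the UUID is needed for the aggregate device.
        let description = CATapDescription(stereoMixdownOfProcesses: [processObject])
        description.uuid = UUID()
        try check(AudioHardwareCreateProcessTap(description, &tapID), "create tap")

        // Create a private aggregate device that lists the tap.
        let aggregate: [String: Any] = [
            kAudioAggregateDeviceNameKey: "Tap-\(pid)",          // arbitrary name
            kAudioAggregateDeviceUIDKey: UUID().uuidString,
            kAudioAggregateDeviceIsPrivateKey: true,
            kAudioAggregateDeviceTapListKey: [[kAudioSubTapUIDKey: description.uuid.uuidString]]
        ]
        try check(AudioHardwareCreateAggregateDevice(aggregate as CFDictionary, &deviceID),
                  "create aggregate device")

        // Read the tap's stream format and build a matching AVAudioFormat.
        address.mSelector = kAudioTapPropertyFormat
        var asbd = AudioStreamBasicDescription()
        size = UInt32(MemoryLayout<AudioStreamBasicDescription>.size)
        try check(AudioObjectGetPropertyData(tapID, &address, 0, nil, &size, &asbd), "read tap format")
        guard let format = AVAudioFormat(streamDescription: &asbd) else { throw TapError.invalidFormat }

        // Open the output file; reusing the tap format's settings is an assumption of this sketch.
        let file = try AVAudioFile(forWriting: outputURL, settings: format.settings,
                                   commonFormat: format.commonFormat,
                                   interleaved: format.isInterleaved)
        self.file = file

        // Wrap each incoming buffer list without copying and append it to the file.
        try check(AudioDeviceCreateIOProcIDWithBlock(&procID, deviceID, queue) { _, inputData, _, _, _ in
            guard let buffer = AVAudioPCMBuffer(pcmFormat: format, bufferListNoCopy: inputData,
                                                deallocator: nil) else { return }
            try? file.write(from: buffer)
        }, "create IO proc")

        try check(AudioDeviceStart(deviceID, procID), "start device")
    }

    func stop() {
        // Tear everything down in reverse order of creation.
        if let procID {
            AudioDeviceStop(deviceID, procID)
            AudioDeviceDestroyIOProcID(deviceID, procID)
            self.procID = nil
        }
        if deviceID != AudioObjectID(kAudioObjectUnknown) {
            AudioHardwareDestroyAggregateDevice(deviceID)
            deviceID = AudioObjectID(kAudioObjectUnknown)
        }
        if tapID != AudioObjectID(kAudioObjectUnknown) {
            AudioHardwareDestroyProcessTap(tapID)
            tapID = AudioObjectID(kAudioObjectUnknown)
        }
        file = nil // releasing the AVAudioFile closes it
    }
}
```

A caller would create a `ProcessTapRecorder`, call `start(pid:outputURL:)` with the PID of the target process and a file URL (for example one ending in `.caf`), and call `stop()` when done. If `start` throws partway through, calling `stop()` releases whatever was already created.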
Thanks to @WFT for helping me with this project.