JetKVM Advanced, CGO-based 2-way Audio Support #718

pennycoders · 2025-08-02T02:17:37Z

Summary

This PR introduces bidirectional audio support to JetKVM, enabling both audio output (listening to the managed device) and audio input (microphone from browser to device). Audio is implemented using an in-process CGO architecture that directly calls C code for ALSA audio capture/playback and Opus encoding/decoding. The managed device presents itself as a USB Audio Class 1 (UAC1) gadget providing both stereo speakers and a stereo microphone interface over USB.

Key Features:

Bidirectional stereo audio (48kHz, 16-bit, 2 channels)
In-process CGO implementation for low latency and simplicity
USB Audio Gadget (UAC1) integration
WebRTC-based real-time streaming with Opus codec
Frontend controls for enabling/disabling audio output and input
HDMI or USB audio capture source selection
SDP munging for proper stereo audio support in browsers

Credits

Thank you, @lqs: your [draft] USB Audio Support #446 PR is what encouraged me to pursue this.

Thanks!
Alex

CLAassistant · 2025-08-02T02:17:43Z

All committers have signed the CLA.

pennycoders · 2025-08-04T15:52:59Z

Great news! I'll soon update this PR with Audio Input pass-through functionality too

IDisposable · 2025-08-07T22:25:07Z

This is amazing!

Would it be possible to forward the audio channel on the device's HDMI input to the browser?

By this I mean that if I set my host/controlled device's audio output to the JetKVM virtual monitor, the sound will be coming in on the HDMI stream, which might be possible to extract (I know nothing about that hardware), so we could have the host audio come through without an additional (virtual) audio device.

pennycoders · 2025-08-08T05:44:13Z

Hi @IDisposable

Glad you like this functionality. I'm actually looking into this, mainly to free up as much of that USB bandwidth as possible. However, it is a little trickier, as it most likely also requires changes in the rv1106-system repo containing the OS.

In case I do manage to pull that off before the v0.5.0 release for which this functionality has been scheduled, I'll update this PR.

Thanks,
Alex

vvns · 2025-08-10T12:46:35Z

JetKVM Audio PR Review & Test Feedback

Hi @pennycoders ,

First, thanks for the work on bringing audio and mic support into JetKVM.

I’ve tested the new functionality in a local LAN environment with both playback and microphone streaming active, including during real-world scenarios like a Teams call.

Test conditions:

Setup: Wired LAN, low network latency, tested with a headset mic.
Modes tested: Low, Medium, High, Ultra for both playback and mic.

Main observations:

Mic quality constant across modes
- Microphone stream sounds the same in all modes.
- Quality is acceptable but not “HD” — there is a constant background noise floor, even with a headset mic on a clean LAN.
Ultra playback distortion
- In Ultra mode only, playback sometimes has a warped/buzzy/distorted effect.
- Low/Medium/High playback modes sound good and consistent.
Latency when mic is active
- Mouse and keyboard control become noticeably less responsive whenever the mic stream is active, even on a low-latency LAN connection.
- Likely due to video/control WebSocket traffic competing with audio packets on the same channel.
Packet loss
- Playback drop rate: ~22%
- Mic drop rate: ~13%
- Loss observed despite no network congestion, pointing to buffering or scheduling bottlenecks.

Potential Improvements (Technical):

Separate transport channels
- Move audio to a dedicated WebSocket endpoint (e.g. /ws/audio) or use WebRTC for audio transport.
- Prevents video/control from being delayed by audio bursts.
Opus tuning exposure
- Make parameters adjustable via UI or JSON config:
  - bitrate, frame size, complexity
  - FEC, DTX, VBR/CBR
- Lets users balance latency, quality, and bandwidth.
ALSA parameter control
- Expose period_time and buffer_time for fine-tuning latency vs underrun protection.
Queue management
- Use a bounded audio frame queue with drop-oldest to prevent latency spikes when encoding falls behind.
Noise reduction & echo cancellation
- Integrate RNNoise or WebRTC AEC/NS for mic clarity.
- Even simple high-pass filtering can reduce constant hum.
Thread/process separation
- Run audio encode/decode in its own goroutine/process to isolate timing from video/control.
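
The "Queue management" suggestion above can be sketched as follows. This is a minimal Go illustration of a bounded queue with a drop-oldest policy, not code from the PR: the type `frameQueue` and its methods are hypothetical names, and the sketch assumes a single producer goroutine.

```go
package main

import "fmt"

// frameQueue is a bounded FIFO of encoded audio frames. When it is full,
// Push discards the oldest frame so queued latency cannot grow without bound.
type frameQueue struct {
	ch      chan []byte
	dropped int // frames evicted so far; written only by the producer
}

func newFrameQueue(capacity int) *frameQueue {
	if capacity < 1 {
		capacity = 1 // an unbuffered channel could never accept a frame here
	}
	return &frameQueue{ch: make(chan []byte, capacity)}
}

// Push never blocks. It assumes a single producer goroutine.
func (q *frameQueue) Push(frame []byte) {
	for {
		select {
		case q.ch <- frame:
			return
		default:
			// Queue is full: evict the oldest frame, then retry the send.
			select {
			case <-q.ch:
				q.dropped++
			default:
				// A consumer drained the queue in the meantime; just retry.
			}
		}
	}
}

// Pop returns the oldest queued frame, or false if the queue is empty.
func (q *frameQueue) Pop() ([]byte, bool) {
	select {
	case f := <-q.ch:
		return f, true
	default:
		return nil, false
	}
}

func main() {
	q := newFrameQueue(3)
	for i := 0; i < 5; i++ {
		q.Push([]byte{byte(i)}) // frames 0 and 1 are evicted
	}
	for {
		f, ok := q.Pop()
		if !ok {
			break
		}
		fmt.Println("frame", f[0])
	}
	fmt.Println("dropped", q.dropped)
}
```

Dropping the oldest frame rather than the newest trades a brief audible gap for bounded latency, which is usually the better trade for interactive audio.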

Happy to re-run these tests and provide before/after metrics once adjustments are implemented.
This PR is already a big step forward, and with these improvements, we could get low-latency, clean mic audio without impacting remote control responsiveness.

pennycoders · 2025-08-10T18:17:24Z

Hi @vvns

Thank you very much for putting this through its paces! This is great feedback that I can definitely work with. I initially encountered the interference with the keyboard and mouse that you mention and made some optimizations. Do you happen to know the commit hash you tested at? Is it the latest version of my branch? I am asking because I've tested actual calls with the latest implementation and it was definitely usable.

I will break down your feedback and see what I can do about each of the items.

If you want we can discuss more in-depth on other channels too.

Thanks,

Alex

vvns · 2025-08-10T18:26:56Z

Hi @pennycoders ,

Glad the feedback is useful! 👍
I’ve confirmed that my tests were run on the latest commit at the time — 5f905e7 from your feat/audio-support branch — so the results I reported already include your most recent optimizations.

The version is indeed usable, but in slightly more demanding conditions (e.g., during calls or with sustained mic usage) the remote control latency — which was very low before the audio feature — increases significantly, to the point where slow mouse movement becomes noticeably delayed.

I can still retest to be sure nothing was missed, but the latency impact with mic active, packet loss, and Ultra mode distortion were all observed on that commit.

I’m happy to continue sharing feedback as you push further updates, so we can iterate quickly toward the best possible audio and control experience. Let me know which channel you’d prefer for more direct discussion, so you can share any details privately if needed.

Thanks again for the great work — we’re close to a fully smooth audio + control experience.

pennycoders · 2025-08-10T19:09:56Z

Hi @vvns,

Are you on the JetKVM Discord?

If so, we can discuss there. What's your username?

Thanks,
Alex

IDisposable

This really looks nice. All my comments are questions or nits, so feel free to ignore them... I wonder if we need to be more explicit in the priority assignment of the other RTC channels (media/serial/rpc), as we really want to ensure the control signals get through at very high fidelity... It might even be worthwhile splitting up the RPC messages into control vs. advisory messaging, but that's not this PR :)

cloud.go

dev_deploy.sh

input_rpc.go

ui/src/components/WebRTCVideo.tsx

ui/src/components/popovers/AudioControlPopover.tsx

ui/src/hooks/useMicrophone.ts

webrtc.go

cloud.go

am-zed · 2025-08-23T00:50:02Z

Audio works in Firefox and Chrome, but clicking on Audio button in Chrome throws error "Cannot read properties of undefined (reading 'addEventListener')":

pennycoders · 2025-08-24T11:18:40Z

Audio works in Firefox and Chrome, but clicking on Audio button in Chrome throws error "Cannot read properties of undefined (reading 'addEventListener')":

Hi! I am actually developing and testing this using Chrome. You mean the Audio button in the Actions bar (the top menu), right? How did you deploy the feat/audio-support branch to your JetKVM? Also, what is your Chrome version? Is there something particular about your networking setup, such as WebSockets or WebRTC being blocked? Any unusual Chrome extensions installed? Can you please try again with the latest version of the branch?

Thanks!

J-Bu · 2025-12-10T15:19:36Z

I flashed the image and app with the HDMI fixes and it is working for me. But to be fair I did also not notice any clicks/pops before those changes.

pennycoders · 2025-12-10T17:20:43Z

I flashed the image and app with the HDMI fixes and it is working for me. But to be fair I did also not notice any clicks/pops before those changes.

Thanks! They were happening at low volume, when sound with long silence periods was playing. I've spotted it on an Ubuntu 24.04 laptop.

It was one of those things driving me nuts because I couldn't figure it out... until I did. Just for reference, this was the video that made it occur: https://youtu.be/ucZl6vQ_8Uo?si=HpnJRBdnBJFn8OeA

IDisposable · 2025-12-11T10:41:59Z

Resolved the merge conflicts... we should probably delete all changes to the language message files (except the English) and then rerun npm run i18n:machine-translate because (especially with the ZH Chinese) things have been updated a lot since this merry journey started.

pennycoders · 2025-12-11T15:26:10Z

Resolved the merge conflicts... we should probably delete all changes to the language message files (except the English) and then rerun npm run i18n:machine-translate because (especially with the ZH Chinese) things have been updated a lot since this merry journey started.

Thanks for fixing the conflicts! I definitely agree on the number of times the translations have changed, but does that command translate automatically? Or does it just use English?

IDisposable · 2025-12-11T15:47:36Z

Resolved the merge conflicts... we should probably delete all changes to the language message files (except the English) and then rerun npm run i18n:machine-translate because (especially with the ZH Chinese) things have been updated a lot since this merry journey started.

Thanks for fixing the conflicts! I definitely agree on the number of times the translations have changed, but does that command translate automatically? Or does it just use English?

I'll make a run at making sure things are in sync tomorrow. The machine-translate just uses English to fill in any missing messages in the other languages. So if you take the exact version of ZH from current dev (without any of your new strings) and then run npm run i18n:machine-translate it will fill in the default translations (in all languages) for any missing strings.
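
The fill-only-missing behaviour described above can be sketched like this. It is a hedged Go illustration, not the actual `i18n:machine-translate` script: the function `fillMissing`, the `translate` callback, and the message keys are all hypothetical, and the thread does not make clear whether the real command copies the English text or machine-translates it, so that step is left pluggable.

```go
package main

import (
	"fmt"
	"sort"
)

// fillMissing adds every key that exists in base (English) but not in target,
// using translate to produce the value. Existing target entries are never
// overwritten. It returns the added keys in sorted order.
func fillMissing(base, target map[string]string, translate func(string) string) []string {
	var added []string
	for key, text := range base {
		if _, ok := target[key]; ok {
			continue // already translated: leave it alone
		}
		target[key] = translate(text)
		added = append(added, key)
	}
	sort.Strings(added)
	return added
}

func main() {
	// Hypothetical message keys, for illustration only.
	en := map[string]string{
		"audio_output_label": "Audio Output",
		"audio_input_label":  "Audio Input",
	}
	zh := map[string]string{
		"audio_output_label": "音频输出",
	}
	// Identity "translation": simply falls back to the English text.
	added := fillMissing(en, zh, func(s string) string { return s })
	fmt.Println("added keys:", added)
	fmt.Println("zh audio_input_label:", zh["audio_input_label"])
}
```

This is why starting from the current dev version of a locale file matters: only keys absent from that file are filled in, so strings that were already updated on dev stay untouched.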

Merged changes from dev branch including:
- Diagnostics logging and download feature (jetkvm#1078)
- E2E test infrastructure improvements
- UI component updates and refactoring
- Japanese keyboard layout support
- Various bug fixes and improvements

Resolved conflicts in:
- jsonrpc.go: Combined audio RPC handlers with diagnostics
- SettingsItem.tsx: Kept enhanced badge implementation
- WebRTCVideo.tsx: Kept isSecureContext() utility
- useJsonRpc.ts: Merged failsafe blocked methods
- devices.$id.settings.video.tsx: Kept EDID initialization logic
- devices.$id.tsx: Combined audio settings with E2E test handlers

- Fix prettier formatting in SettingsItem, WebRTCVideo, AudioPopover
- Replace react-hook-form watch() with useWatch() for React Compiler compatibility
- Remove unused dependencies from useCallback
- Add eslint-disable for known React ref false positive

DonOregano · 2025-12-26T11:23:26Z

I would be happy to test this out on my device, if that allows this PR to progress. I am not sure how to obtain a build to put on there though. Do I have to build my own, or is there an "approved builder"?

0-don · 2025-12-27T22:16:03Z

looking forward to this

Remove the ability to select between HDMI and USB for audio output. Audio output now always uses the TC358743 HDMI capture device.
- Remove AudioOutputSource from config and RPC handlers
- Remove audio source UI dropdown and store state
- Clean up unused translations across all locales
- Update C audio comments to reflect HDMI-only output

beetahnator · 2025-12-29T19:41:14Z

I just tested this on my device, it works really well!

I was able to listen to Spotify and do a Teams Call so both speakers and mic worked

Quality was good too

Marvur · 2026-01-08T17:12:25Z

Thanks @pennycoders for taking this to the finish line. Tested this on my device, works well out of the box.

However, the microphone audio captured on the remote device is higher pitched than the normal voice (think Chip 'n' Dale voice).

Remote machine
Debian 12 x64

Config

Mode - HTTPS
Video - JetKVM Default
Hardware - Keyboard, Absolute Mouse (Pointer), USB Audio
Audio - Default configuration

pennycoders · 2026-01-08T21:10:38Z

Thanks @pennycoders for taking this to the finish line. Tested this on my device, works well out of the box.

However, the microphone audio captured on the remote device is higher pitched than the normal voice (think Chip 'n' Dale voice).

Remote machine

Debian 12 x64

Config

Mode - HTTPS

Video - JetKVM Default

Hardware - Keyboard, Absolute Mouse (Pointer), USB Audio

Audio - Default configuration

I've tested this on MacOS, Ubuntu 24.04, Windows 10 and Windows 11 as an OS of the remote PC. I must admit I've never tested on a Debian 12 x64. I don't have access to one. Maybe one solution would be to use some sort of input hardware sample detection and dynamic resampling. In any case, I need to reproduce this first.
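
As background for the resampling idea: audio played back at a higher sample rate than it was produced at sounds sped up and higher pitched, which is one possible (unconfirmed) explanation for the report above. The following is only a minimal Go sketch of sample-rate conversion by linear interpolation for mono 16-bit PCM. The function name `resampleLinear` and the example rates are hypothetical, and a real implementation would more likely use a dedicated resampler library with proper filtering.

```go
package main

import "fmt"

// resampleLinear converts mono 16-bit PCM from inRate to outRate (both in Hz)
// using linear interpolation between neighbouring input samples.
func resampleLinear(in []int16, inRate, outRate int) []int16 {
	if len(in) == 0 || inRate <= 0 || outRate <= 0 {
		return nil
	}
	n := len(in) * outRate / inRate
	out := make([]int16, n)
	step := float64(inRate) / float64(outRate) // input samples per output sample
	for i := range out {
		pos := float64(i) * step
		j := int(pos)
		frac := pos - float64(j)
		k := j + 1
		if k >= len(in) {
			k = len(in) - 1 // hold the last sample at the end of the buffer
		}
		a, b := float64(in[j]), float64(in[k])
		out[i] = int16(a + (b-a)*frac)
	}
	return out
}

func main() {
	// Illustrative rates only: upsample a short ramp from 24 kHz to 48 kHz.
	in := []int16{0, 100, 200, 300}
	fmt.Println(resampleLinear(in, 24000, 48000))
}
```

Linear interpolation is cheap but adds some high-frequency distortion, so it is useful for confirming a rate mismatch rather than as a final fix.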

Marvur · 2026-01-09T19:48:22Z

Another critical observation: if the mic is enabled for a session and the input audio is not "used" on the remote device, all audio packets are buffered until they are "used" (by a recording or a call).

E.g. mic enabled at 10:00 AM > recording started at 10:01 AM > recorded audio contains the 10:00 to 10:01 audio before the actual audio from 10:01.

pennycoders · 2026-01-10T21:33:41Z

Another critical observation: if the mic is enabled for a session and the input audio is not "used" on the remote device, all audio packets are buffered until they are "used" (by a recording or a call).

E.g. mic enabled at 10:00 AM > recording started at 10:01 AM > recorded audio contains the 10:00 to 10:01 audio before the actual audio from 10:01.

That is Very Strange. It looks / feels to me like a USB Host-side issue though. As stated, I did run quite extensive tests on the platforms I've mentioned above.

- Add discreet note about potential reboot requirement after USB reconfiguration to restore audio input
- Clarify USB Audio labels to specify "Audio Input" since it refers to browser microphone -> target computer flow
- Update description to explain the feature better

Marvur · 2026-01-14T03:25:57Z

Another critical observation: if the mic is enabled for a session and the input audio is not "used" on the remote device, all audio packets are buffered until they are "used" (by a recording or a call).
E.g. mic enabled at 10:00 AM > recording started at 10:01 AM > recorded audio contains the 10:00 to 10:01 audio before the actual audio from 10:01.

That is Very Strange. It looks / feels to me like a USB Host-side issue though. As stated, I did run quite extensive tests on the platforms I've mentioned above.

Indeed strange. Tested with a M5 Macbook Pro Host on Sequoia 15.7.3 from a Windows Guest on Chrome (same network) with the same results. Any pointers on what remotely could be causing this?

Snake4life · 2026-01-15T09:07:13Z

Would it be possible to add an option to edit the class of the audio USB device and make it look like an actual headset with microphone instead of a virtual device? The problem is that some companies block USB devices and only allow specific vendors such as Jabra & Plantronics for example.

pennycoders · 2026-01-15T09:51:55Z

Would it be possible to add an option to edit the class of the audio USB device and make it look like an actual headset with microphone instead of a virtual device? The problem is that some companies block USB devices and only allow specific vendors such as Jabra & Plantronics for example.

Shouldn't that be possible from Hardware settings, by updating the USB Identifiers? Haven't paid much attention to that though

Snake4life · 2026-01-15T10:02:33Z

Would it be possible to add an option to edit the class of the audio USB device and make it look like an actual headset with microphone instead of a virtual device? The problem is that some companies block USB devices and only allow specific vendors such as Jabra & Plantronics for example.

Shouldn't that be possible from Hardware settings, by updating the USB Identifiers? Haven't paid much attention to that though

oh it is possible, thanks!

pennycoders · 2026-01-15T11:46:35Z

Would it be possible to add an option to edit the class of the audio USB device and make it look like an actual headset with microphone instead of a virtual device? The problem is that some companies block USB devices and only allow specific vendors such as Jabra & Plantronics for example.

Shouldn't that be possible from Hardware settings, by updating the USB Identifiers? Haven't paid much attention to that though

oh it is possible, thanks!

Please validate on a home Windows laptop first, though.

Marvur · 2026-01-17T01:04:10Z

Another critical observation: if the mic is enabled for a session and the input audio is not "used" on the remote device, all audio packets are buffered until they are "used" (by a recording or a call).
E.g. mic enabled at 10:00 AM > recording started at 10:01 AM > recorded audio contains the 10:00 to 10:01 audio before the actual audio from 10:01.

That is Very Strange. It looks / feels to me like a USB Host-side issue though. As stated, I did run quite extensive tests on the platforms I've mentioned above.

Indeed strange. Tested with a M5 Macbook Pro Host on Sequoia 15.7.3 from a Windows Guest on Chrome (same network) with the same results. Any pointers on what remotely could be causing this?

Found a temporary workaround in case anyone else faces this issue:

Turn the mic "off" and "on" from the web UI. All previously "accumulated" audio gets discarded.
OR
Start the activity that uses the mic first (join your call or start recording), and then turn the mic on from the web UI.

pennycoders changed the title ~~JetKVM Advanced, CGO Audio Support~~ JetKVM Advanced, CGO-based Audio Support Aug 2, 2025

pennycoders force-pushed the feat/audio-support branch from db2d107 to 4f47d62 Compare August 2, 2025 02:23

adamshiervani added this to JetKVM Aug 4, 2025

adamshiervani added this to the 0.5.0 milestone Aug 4, 2025

adamshiervani moved this to Backlog in JetKVM Aug 4, 2025

adamshiervani moved this from Backlog to In progress in JetKVM Aug 4, 2025

adamshiervani moved this from In progress to In review in JetKVM Aug 4, 2025

adamshiervani moved this from In review to In progress in JetKVM Aug 4, 2025

adamshiervani moved this from In progress to In Review in JetKVM Aug 4, 2025

adamshiervani requested review from adamshiervani and ym August 4, 2025 12:26

adamshiervani mentioned this pull request Aug 4, 2025

[draft] USB Audio Support #446

Closed

3 tasks

adamshiervani linked an issue Aug 4, 2025 that may be closed by this pull request

Add sound support #315

Open

pennycoders force-pushed the feat/audio-support branch from c9f4aea to 3444607 Compare August 4, 2025 20:42

pennycoders changed the title ~~JetKVM Advanced, CGO-based Audio Support~~ JetKVM Advanced, CGO-based 2-way Audio Support Aug 4, 2025

adamshiervani mentioned this pull request Aug 5, 2025

Add sound support #315

Open

IDisposable reviewed Aug 13, 2025

pennycoders force-pushed the feat/audio-support branch from 7408195 to 767311e Compare August 13, 2025 11:35

IDisposable reviewed Aug 13, 2025

IDisposable mentioned this pull request Aug 22, 2025

Rework keyboard management to allow device-side tracking of modifier states #725

Merged

pennycoders added 2 commits December 8, 2025 20:40

fix(build): run install_audio_deps.sh in Dockerfile.build

9aa2c03

The script was copied but never executed, causing Docker-based builds (via dev_deploy.sh) to fail due to missing ALSA/Opus/SpeexDSP libraries. Reported-by: J-Bu

Merge branch 'dev' into feat/audio-support

c91bcfc

Merge branch 'dev' into feat/audio-support

c52924b

pennycoders added 2 commits December 24, 2025 00:07

pennycoders added 3 commits December 29, 2025 20:38

i18n: add audio feature translations for all languages

533753e

Machine translate new audio-related keys to all supported languages: da, de, es, fr, it, nb, sv, zh

Merge branch 'dev' into feat/audio-support

73bb249

Merge branch 'dev' into feat/audio-support

34329ec

pennycoders added 2 commits January 9, 2026 20:11

Merge branch 'dev' into feat/audio-support

8526d6e

fix: remove unused supervisor import and non-existent rpcGetDiagnosti…

41f32f2

…cs handler
