Real TCP socket glue, Rust example binaries, and Cargo-native CI #27
…ve CI
- Server::listen/poll and Client::connect now drive real TCP sockets through the Transport abstraction, performing the full handshake + connect/createStream/publish/play AMF0 exchange and frame I/O.
- Fix malformed connect _result response (missing information object) and a mem::zeroed() UB panic on Frame, both only reachable once the socket glue made these paths exercisable.
- Add tests/server_client_loopback.rs, an end-to-end test over real loopback sockets.
- Port the old C examples and interop test harnesses to Rust (minimal_server, minimal_client, ffmpeg_ingest, play_pull) and rewire tests/interop/*.sh to build and drive them via cargo.
- Re-enable abi-check, release, and interop workflows on push/pull_request now that they have working Cargo-native and Rust-native counterparts.
📝 Walkthrough
Replaces RTMP stubs with Rust client and server networking, adds example binaries and interop tests, and updates CI workflows for semver checks and Cargo-based release packaging.
Changes: Rust RTMP Port
Estimated code review effort: 🎯 5 (Critical) | ⏱️ ~120 minutes
Actionable comments posted: 17
🧹 Nitpick comments (1)
tests/server_client_loopback.rs (1)
16-20: 🎯 Functional Correctness | 🔵 Trivial | ⚡ Quick win
Assert the published frame contents, not just “some bytes arrived.”
`on_frame` counts any non-empty frame, so this test still passes if the server delivers the wrong frame type or a truncated/corrupted payload. `Frame` already exposes `frame_type`, `size`, and `data` in `src/types.rs:182-202`; copy the bytes in the callback and assert they equal the sent 32-byte `[0xAB; 32]`.
Also applies to: 72-72
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In @.github/workflows/abi-check.yml:
- Around line 31-41: The baseline selection in the Determine baseline ref step
currently includes the tag that triggered the workflow, causing cargo
semver-checks to compare a release against itself. Update the PREV_TAG lookup so
it explicitly excludes the current refs/tags/v* ref from the git tag search
before setting skip/tag in the baseline step, using the existing baseline job
logic and GITHUB_OUTPUT values.
- Around line 16-18: The ABI check workflow is using the newest tag as its own
baseline on v* pushes, so update the workflow to select the previous release tag
instead of the latest one. Adjust the tag-selection logic in the abi-check job
so the step that feeds cargo semver-checks excludes the currently pushed tag and
resolves the prior version tag, keeping the existing checkout and semver-checks
flow intact.
In @.github/workflows/release.yml:
- Around line 80-85: The release packaging step is swallowing missing library
artifacts in the “Stage dist tree” script, which can allow an incomplete tarball
to be published. Update the dist staging logic to validate that both
`liblibrtmp2.so` and `liblibrtmp2.a` are present before continuing, and fail the
workflow with a clear error if either artifact is absent; use the existing
release job and the staging block around `DIST_DIR` as the place to enforce this
check.
In `@examples/ffmpeg_ingest.rs`:
- Around line 71-101: The failure path in ffmpeg_ingest’s main loop is using a
stale success flag; if server.poll(200) errors after the counters already
reached the thresholds, the test still reports a timeout. Recompute success from
the final VIDEO_FRAMES and AUDIO_FRAMES values right before the !success check,
and use that updated result when deciding whether to return ExitCode::from(2).
Keep the existing loop and diagnostics, but make the final decision in this
block reflect the latest counters rather than only the in-loop non-error path.
In `@examples/play_pull.rs`:
- Around line 59-82: The pass/fail logic in the pull example still depends on
the mutable success flag, which can miss a valid run if client.poll(200) returns
an error after the first audio/video frames arrive. Update the decision in the
main loop around client.poll, VIDEO_FRAMES, and AUDIO_FRAMES to evaluate the
final frame counters at the end instead of relying on success, and use those
counters to determine whether to print PASS or FAIL.
In `@src/client/mod.rs`:
- Around line 112-116: The `publish()` and `play()` flows in `Client` ignore the
result of `wait_for_command("onStatus")`, so they can return success and advance
`ClientState` even when the server rejects, times out, or disconnects. Update
both methods to check and propagate the `wait_for_command` result instead of
discarding it, and only set the state to `Publishing` or `Playing` after a
successful `onStatus` confirmation.
- Around line 364-368: The URL parsing in the client path split logic accepts
incomplete RTMP URLs, so update the parsing in the function that builds the
`(host, port, app, stream_key)` tuple to reject inputs missing either the app or
the stream key. After splitting the path, validate that both `app` and
`stream_key` are non-empty and return an error from this parser instead of
`Ok(...)` when the URL only contains `rtmp://host` or `rtmp://host/app`.
- Around line 320-323: The client-side blocking poll in
`recv_message()`/`read_exact()` currently waits forever with `libc::poll(...,
-1)`, which can hang `connect()`, `publish()`, or `play()` during handshake or
reads. Update the poll loop in `src/client/mod.rs` to use a finite timeout
instead of infinite blocking, and when the timeout expires return
`ErrorCode::Timeout`. Keep the change localized around the `again` retry path
and the `transport.fd()` poll handling so stalled peers fail cleanly.
- Around line 75-84: The connection setup in Client::connect leaks the socket if
do_handshake() fails because into_raw_fd() takes ownership before the handshake
succeeds and client_fd is only set afterward. Fix this by keeping the TcpStream
owned until the handshake completes, or by ensuring the raw fd is explicitly
closed on any error path in Client::connect while preserving the existing
Transport::new_plain and do_handshake flow. The nonblocking socket change is
unnecessary since the read path already uses MSG_DONTWAIT.
In `@src/server/mod.rs`:
- Around line 102-105: The accepted connection setup in Server::new / the
accept-path around Conn::new is always using Transport::new_plain(fd), which
silently ignores tls_ctx even when tls_enabled is set. Update this code to use a
TLS transport when Server::new initialized tls_ctx, wiring the accepted fd
through the TLS-capable Transport path; if that is not implemented yet,
explicitly reject TLS-enabled configurations instead of falling back to plain
TCP.
- Around line 97-106: Enforce the configured connection cap in the accept loop
inside `Server::run` before creating and pushing a new `Conn`. Check
`self.connections.len()` against `self.config.max_connections` immediately after
`listener.accept()` succeeds, and if the limit is reached, skip or reject the
peer without calling `into_raw_fd()` or `Transport::new_plain`. Keep the
existing connection setup path for accepted peers only when under the limit.
- Line 143: The flush in the connection handling path is currently ignored,
which can leave failed peers in self.connections after a write-side error.
Update the logic around conn.flush() to check its result and, on failure, treat
it as a broken connection by removing or closing that conn so it does not remain
tracked. Use the existing connection management flow in src/server/mod.rs around
the flush call and the self.connections collection to keep the cleanup
consistent.
- Around line 58-62: The address formatting in the server bind logic drops IPv6
brackets after split_host_port, so the addr built in the server module can
become invalid for TcpListener::bind. Update the bind address construction to
preserve or re-add brackets whenever the host from split_host_port contains a
colon, while keeping the existing behavior for empty and IPv4 host values.
In `@tests/interop/enhanced_rtmp_interop.sh`:
- Around line 42-58: In run_one, the ffmpeg publish step should be treated as
the primary test result instead of always waiting on the ingest server first.
Capture the ffmpeg exit status immediately after the timeout ffmpeg invocation,
and if it is non-zero, fail and return from run_one before calling wait on
"$srv"; keep the existing ingest exit handling only for the successful publish
path. Use the run_one workflow and the ffmpeg/ wait "$srv" sequence to locate
the change.
In `@tests/interop/ffmpeg_interop.sh`:
- Around line 38-50: The ffmpeg publish flow in ffmpeg_interop.sh does not fail
fast after recording FF_RC, so the script can hang waiting on SRV even when
publish never starts. Update the publish block around the ffmpeg invocation and
wait "$SRV" so that a nonzero FF_RC causes an immediate exit or explicit failure
path before waiting on the ingest server, while keeping the existing set -e
behavior intact for the rest of the script.
In `@tests/interop/play_interop.sh`:
- Around line 43-46: The startup flow in play_interop.sh currently relies on a
fixed sleep after launching mediamtx, which can let the script continue even if
the process died or never bound the RTMP port. Replace that sleep with an
explicit readiness check around the mediamtx launch: verify the MTX process is
still running and poll the RTMP port until it accepts connections before
starting ffmpeg or the play client. Keep the check close to the mediamtx startup
block so the downstream interop-play steps only run after readiness is
confirmed.
In `@tests/server_client_loopback.rs`:
- Around line 40-71: The client thread in server_client_loopback.rs does not
surface setup failures until join(), so failures in Client::connect, publish, or
send_frame can look like timeouts. Update the spawned client workflow to return
a Result through a channel or shared result so the main test can fail
immediately on the first error, and keep the existing poll loop only for waiting
on FRAMES_RECEIVED once client setup succeeds.
---
Nitpick comments:
In `@tests/server_client_loopback.rs`:
- Around line 16-20: The loopback test only increments FRAMES_RECEIVED for any
non-empty frame, so it can pass even when the server sends the wrong or
corrupted payload. Update on_frame to inspect the Frame contents via frame_type,
size, and data, capture the received bytes, and assert they match the expected
32-byte payload of [0xAB; 32]. Use the existing Frame fields in the callback and
tighten the test so it validates exact content rather than merely counting
bytes.
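The tightened check asked for in the nitpick could look roughly like the sketch below. The `Frame` struct here is a simplified, hypothetical stand-in (a borrowed slice instead of whatever `src/types.rs` really holds), and `RECEIVED` is an illustrative name; only the idea of copying the bytes in the callback and comparing them with `[0xAB; 32]` comes from the review.

```rust
use std::sync::Mutex;

/// Hypothetical, simplified stand-in for the crate's `Frame`.
/// The real type lives in src/types.rs and may differ.
struct Frame<'a> {
    frame_type: u8,
    size: usize,
    data: &'a [u8],
}

/// Last frame seen by the callback: (frame_type, copied payload bytes).
static RECEIVED: Mutex<Option<(u8, Vec<u8>)>> = Mutex::new(None);

/// Callback: copy the payload out instead of only counting frames,
/// so the test can compare exact contents afterwards.
fn on_frame(frame: &Frame) {
    // Never read past the slice even if `size` is inconsistent.
    let n = frame.size.min(frame.data.len());
    *RECEIVED.lock().unwrap() = Some((frame.frame_type, frame.data[..n].to_vec()));
}

/// True only if a frame arrived with exactly the expected type and bytes.
fn received_matches(expected_type: u8, expected: &[u8]) -> bool {
    match RECEIVED.lock().unwrap().as_ref() {
        Some((ty, bytes)) => *ty == expected_type && bytes.as_slice() == expected,
        None => false,
    }
}
```

With this shape, a truncated or corrupted payload makes `received_matches` return false, whereas a plain frame counter would still have been incremented.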
📒 Files selected for processing (16)
- .github/workflows/abi-check.yml
- .github/workflows/interop-ffmpeg.yml
- .github/workflows/interop-play.yml
- .github/workflows/release.yml
- examples/ffmpeg_ingest.rs
- examples/minimal_client.rs
- examples/minimal_server.rs
- examples/play_pull.rs
- src/client/mod.rs
- src/server/mod.rs
- src/session/conn.rs
- src/types.rs
- tests/interop/enhanced_rtmp_interop.sh
- tests/interop/ffmpeg_interop.sh
- tests/interop/play_interop.sh
- tests/server_client_loopback.rs
Client::recv_message/poll never processed Set Chunk Size (msg type 1) control messages, so the chunk reader kept assuming the default 128-byte chunk size for all incoming csids. Real servers like mediamtx send Set Chunk Size right after connect and then immediately use the new size for the _result reply, desyncing our chunk parser and hanging the client until the peer's idle timeout fired. This is what broke interop-play in CI. Now both paths apply ChunkRegistry::set_all_chunk_size when they see this control message, matching what the server side already does via message::dispatch. Verified locally against a real mediamtx server and real ffmpeg publisher: connect/createStream/play now complete immediately and frames are pulled successfully. Co-Authored-By: Claude Sonnet 4.6 <[email protected]> Claude-Session: https://claude.ai/code/session_01G8jobRGbGgHuBs7251K587
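As a rough illustration of the fix described above: a Set Chunk Size message (type 1) carries a 4-byte big-endian size, and the reader has to switch to it before parsing the next chunk. The sketch below is self-contained and does not use the crate's `ChunkRegistry`; `ChunkState`, `parse_set_chunk_size`, and `handle_control` are hypothetical names, and masking the top bit (rather than rejecting the message) is an assumption of this sketch.

```rust
/// RTMP message type id for Set Chunk Size.
const MSG_SET_CHUNK_SIZE: u8 = 1;
/// Chunk size assumed until the peer announces a different one.
const DEFAULT_CHUNK_SIZE: u32 = 128;

/// Hypothetical stand-in for the chunk reader's per-connection state.
#[derive(Debug)]
struct ChunkState {
    incoming_chunk_size: u32,
}

impl Default for ChunkState {
    fn default() -> Self {
        Self { incoming_chunk_size: DEFAULT_CHUNK_SIZE }
    }
}

/// Parse a Set Chunk Size payload: 4 bytes, big-endian, 31 significant bits.
/// Returns None for short payloads or a zero size.
fn parse_set_chunk_size(payload: &[u8]) -> Option<u32> {
    let bytes: [u8; 4] = payload.get(..4)?.try_into().ok()?;
    // Assumption: ignore the reserved top bit instead of treating it as an error.
    let size = u32::from_be_bytes(bytes) & 0x7FFF_FFFF;
    if size == 0 { None } else { Some(size) }
}

/// Apply a control message to the reader state.
/// Returns true if the message was a Set Chunk Size and was consumed.
fn handle_control(state: &mut ChunkState, msg_type: u8, payload: &[u8]) -> bool {
    if msg_type != MSG_SET_CHUNK_SIZE {
        return false;
    }
    if let Some(size) = parse_set_chunk_size(payload) {
        state.incoming_chunk_size = size;
    }
    true
}
```

The important property is ordering: the new size must be in effect before the next chunk header is read, otherwise the parser splits the following `_result` at the wrong byte offsets, which is the desync described in the commit.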
…bit review - Client: propagate onStatus errors from publish()/play() instead of discarding them; close the fd on handshake failure to avoid a leak; validate parsed app/streamKey are non-empty; bound the recv-loop poll() calls with a finite timeout instead of blocking forever. - Server: refuse to start listen() when TLS is configured, since the accept path only ever wraps sockets as plaintext; enforce max_connections in the accept loop instead of ignoring it.
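The app/stream-key validation mentioned in this commit can be sketched as follows. This is not the crate's parser: `UrlError` and `parse_rtmp_url` are hypothetical names, the real code returns its own error type, and this simplified version handles only hostnames and IPv4 literals (a bracketed IPv6 host would need extra handling).

```rust
/// Standard RTMP port, used when the URL omits one.
const DEFAULT_RTMP_PORT: u16 = 1935;

/// Hypothetical error type for this sketch only.
#[derive(Debug, PartialEq)]
enum UrlError {
    BadScheme,
    MissingHost,
    BadPort,
    MissingApp,
    MissingStreamKey,
}

/// Split `rtmp://host[:port]/app/stream_key` into its parts,
/// rejecting URLs that lack an app or a stream key.
fn parse_rtmp_url(url: &str) -> Result<(String, u16, String, String), UrlError> {
    let rest = url.strip_prefix("rtmp://").ok_or(UrlError::BadScheme)?;
    // Everything before the first '/' is host[:port]; the remainder is the path.
    let (authority, path) = rest.split_once('/').unwrap_or((rest, ""));
    let (host, port) = match authority.rsplit_once(':') {
        Some((h, p)) => (h, p.parse::<u16>().map_err(|_| UrlError::BadPort)?),
        None => (authority, DEFAULT_RTMP_PORT),
    };
    if host.is_empty() {
        return Err(UrlError::MissingHost);
    }
    // First path segment is the app; the rest is the stream key.
    let (app, stream_key) = path.split_once('/').unwrap_or((path, ""));
    if app.is_empty() {
        return Err(UrlError::MissingApp);
    }
    if stream_key.is_empty() {
        return Err(UrlError::MissingStreamKey);
    }
    Ok((host.to_string(), port, app.to_string(), stream_key.to_string()))
}
```

Under these assumptions, `rtmp://host` and `rtmp://host/app` both fail early instead of producing a tuple with empty fields that only breaks later during `connect` or `publish`.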
split_host_port() strips the brackets from a bracketed IPv6 literal, so re-wrap it in brackets when building the listener bind address (std's SocketAddr parser rejects bare "::1:1935"). Also reap a connection when flush() fails instead of silently ignoring the write error, matching how recv failures are already handled.
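A minimal sketch of the bracket re-wrapping, assuming the host has already had its brackets stripped. `bind_addr` is a hypothetical helper name, and falling back to `0.0.0.0` for an empty host is an assumption of this sketch, not necessarily what the server does.

```rust
/// Build a string suitable for `TcpListener::bind` from a host whose
/// IPv6 brackets (if any) were already stripped, plus a port.
fn bind_addr(host: &str, port: u16) -> String {
    if host.is_empty() {
        // Assumption: an empty host means "all IPv4 interfaces".
        format!("0.0.0.0:{}", port)
    } else if host.contains(':') {
        // A colon in the host means an IPv6 literal, which must be bracketed
        // so the port separator is unambiguous.
        format!("[{}]:{}", host, port)
    } else {
        format!("{}:{}", host, port)
    }
}
```

For example, `bind_addr("::1", 1935)` yields `[::1]:1935`, which parses as a `std::net::SocketAddr`, while the unbracketed `::1:1935` does not.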
abi-check.yml: on a tag push, the just-pushed tag itself was the "latest" tag, so semver-checks compared the release against itself and never actually caught a breaking change. Exclude the current ref's tag when picking a baseline. release.yml: both the .so and .a copy steps swallowed errors with `|| true`, so a build that produced neither artifact would still package and publish an empty release. Fail the job if the staged lib directory ends up empty.
ffmpeg_interop.sh / enhanced_rtmp_interop.sh discarded ffmpeg's publish exit code, so a failed publish only surfaced indirectly as a confusing "ingest server exit=2" timeout message. Capture and check it directly. play_interop.sh used a fixed sleep to wait for mediamtx to start listening, which is flaky under load. Poll the RTMP port until it accepts a connection instead.
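The readiness check in `play_interop.sh` is a shell loop; the same technique, expressed in Rust purely for illustration, is to retry a TCP connect until it succeeds or a deadline passes. `wait_for_port` and the 200 ms / 100 ms intervals are assumptions of this sketch, not values taken from the script.

```rust
use std::net::{SocketAddr, TcpStream};
use std::thread;
use std::time::{Duration, Instant};

/// Poll `addr` until a TCP connection is accepted or `deadline` elapses.
/// Returns true as soon as one connect attempt succeeds.
fn wait_for_port(addr: SocketAddr, deadline: Duration) -> bool {
    let start = Instant::now();
    while start.elapsed() < deadline {
        // Bound each attempt so an unresponsive host cannot stall the loop.
        if TcpStream::connect_timeout(&addr, Duration::from_millis(200)).is_ok() {
            return true;
        }
        thread::sleep(Duration::from_millis(100));
    }
    false
}
```

Compared with a fixed sleep, this returns as soon as the server is listening and fails with a clear signal when it never comes up.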
…nt in loopback test ffmpeg_interop.sh and enhanced_rtmp_interop.sh were failing the build whenever ffmpeg exited non-zero, but the ffmpeg_ingest example server intentionally closes the connection as soon as it has enough frames, before ffmpeg finishes its publish duration. That causes ffmpeg to see a benign "Connection reset by peer" and exit non-zero on a fully successful run. Pass/fail now relies solely on the ingest server's own exit code. Also tightens the loopback test to validate the received frame's content (not just non-zero size), and finalizes the release.yml artifact-staging fix to fail per-missing-artifact with the artifact name in the error.
…early ffmpeg_ingest and play_pull only flipped their success flag on the non-error poll path, so a poll() that delivered the final frames and then errored (e.g. on disconnect) was reported as a timeout despite meeting the frame thresholds. Recompute success from the final counters before deciding pass/fail. The loopback test's client thread also only surfaced connect/publish/ send_frame failures via join() after the full 5s poll deadline, making a deterministic setup failure look like a frame-receive timeout. Send the setup result over a channel so the main thread can fail immediately.
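The first fix in this commit can be sketched as below: the verdict is derived from the counters after the loop ends, whichever way it ended. The counter names follow the review text, but their types, the thresholds, and the `run_until_done` helper (which takes a closure in place of `server.poll(200)` / `client.poll(200)`) are assumptions made for this sketch.

```rust
use std::sync::atomic::{AtomicU32, Ordering};

// Counters bumped by the frame callbacks (types assumed for this sketch).
static VIDEO_FRAMES: AtomicU32 = AtomicU32::new(0);
static AUDIO_FRAMES: AtomicU32 = AtomicU32::new(0);

// Hypothetical thresholds; the real examples define their own.
const MIN_VIDEO: u32 = 10;
const MIN_AUDIO: u32 = 10;

/// True once both counters have reached their thresholds.
fn thresholds_met() -> bool {
    VIDEO_FRAMES.load(Ordering::SeqCst) >= MIN_VIDEO
        && AUDIO_FRAMES.load(Ordering::SeqCst) >= MIN_AUDIO
}

/// Drive `poll` until it errors, the thresholds are met, or the iteration
/// budget runs out. The result is recomputed from the final counters rather
/// than from a flag that is only updated on the non-error path.
fn run_until_done(mut poll: impl FnMut() -> Result<(), String>, max_iters: usize) -> bool {
    for _ in 0..max_iters {
        if poll().is_err() {
            break; // the poll may have delivered frames before failing
        }
        if thresholds_met() {
            break;
        }
    }
    thresholds_met()
}
```

A poll call that delivers the last frames and then reports a disconnect now counts as a pass, because the decision no longer depends on which branch of the loop ran last.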
ffmpeg_interop.sh and enhanced_rtmp_interop.sh now only treat a non-zero ffmpeg exit as fatal when the ingest server is still running (i.e. ffmpeg died before the server got enough data). If the server has already exited, the reset is the server's own intentional early disconnect on success, so its exit code is authoritative. This keeps the hang-prevention CodeRabbit asked for without reintroducing the false failure caused by gating on ffmpeg's exit code alone. play_interop.sh now also bails out early if mediamtx dies during the readiness-poll loop, and uses mktemp for its log files instead of fixed /tmp paths.
A hardcoded, predictable path under /tmp is vulnerable to symlink/TOCTOU attacks: a local attacker could pre-create the file or a symlink and hijack or corrupt the log contents (CWE-377).
Summary
- `Server::listen`/`poll` and `Client::connect` now drive real TCP sockets through the `Transport` abstraction (`std::net::TcpListener`/`TcpStream` bridged to raw fds), performing the full handshake + `connect`/`createStream`/`publish`/`play` AMF0 exchange and frame send/receive; these were previously non-functional stubs.
- `Conn::send_connect_response` wrote a malformed `connect` `_result` (missing the information object after the properties `null`), which made real clients fail to parse the response.
- `mem::zeroed()` on `Frame` panicked in debug builds because `Frame` contains `#[repr(C)]` enums (e.g. `VideoCodec`) whose lowest discriminant isn't 0; replaced with a proper `Default` impl.
- `tests/server_client_loopback.rs`, an end-to-end test that spins up a real `Server` + `Client` over loopback TCP and verifies a published frame is observed.
- `Drop` impls for `Conn` and `Client` to close fds taken over via `into_raw_fd()`.
- `examples/minimal_server.rs`, `examples/minimal_client.rs`, `examples/ffmpeg_ingest.rs`, `examples/play_pull.rs`.
- `tests/interop/*.sh` rewired to build and drive the new Rust example binaries via `cargo build --example`, replacing the old C compile-and-run logic.
- `abi-check.yml`, `release.yml`, `interop-ffmpeg.yml`, and `interop-play.yml` re-enabled on `push`/`pull_request` now that they have working Cargo-native and Rust-native counterparts.

Test plan
- `cargo build --all-features` / `--no-default-features`
- `cargo test --all-features` / `--no-default-features` (45 unit tests + new loopback integration test, all pass)
- `cargo clippy --all-features --all-targets` (no new warnings)
- `cargo build --examples --all-features` (all four example binaries compile cleanly)

Generated by Claude Code
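To make the `mem::zeroed()` item in the summary concrete, here is a small sketch of the pattern. The enum discriminants and the `Frame` fields are hypothetical, chosen only so that no variant equals 0; the crate's real `VideoCodec` and `Frame` definitions may differ.

```rust
/// Hypothetical C-compatible enum with no variant equal to 0.
/// An all-zero bit pattern is therefore not a valid `VideoCodec`.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum VideoCodec {
    H264 = 7,
    Hevc = 12,
}

/// Hypothetical, trimmed-down frame struct for this sketch.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct Frame {
    codec: VideoCodec,
    timestamp_ms: u32,
    size: u32,
}

// `unsafe { std::mem::zeroed::<Frame>() }` would create a `codec` field
// holding 0, which matches no variant. That is undefined behaviour, so it is
// deliberately not executed here.

impl Default for Frame {
    /// Every field gets an explicitly valid value instead of raw zero bytes.
    fn default() -> Self {
        Frame {
            codec: VideoCodec::H264,
            timestamp_ms: 0,
            size: 0,
        }
    }
}
```

Callers that previously zero-initialised a frame can then write `Frame::default()` or use struct-update syntax such as `Frame { size: 32, ..Frame::default() }`.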