Thanks to visit codestin.com
Credit goes to github.com

Skip to content

Latest commit

 

History

History
 
 

Folders and files

NameName
Last commit message
Last commit date

parent directory

..
 
 
 
 
 
 
 
 
 
 
 
 
 
 

README.md

Data Track Throughput Experiment

Coordinated producer and consumer for benchmarking LocalDataTrack / RemoteDataTrack throughput across a sweep of payload sizes and publish rates.

What It Does

  • producer.cpp

    • Publishes a data track named data-track-throughput
    • Runs a default sweep of payload sizes and publish rates (see Test Bounds below)
    • Calls the consumer over RPC before and after each scenario
  • consumer.cpp

    • Registers a room data-frame callback for the producer's data track
    • Receives every frame and records arrival timestamps
    • Logs validation warnings (size mismatches, header mismatches, etc.) to stderr
    • Tracks duplicates and missing messages
    • Appends raw data to scenario-level and per-message CSV files

Design Principles

  • Raw data only in CSV. The consumer writes only directly measured values (counts, byte totals, microsecond timestamps). All derived metrics (throughput, latency percentiles, delivery ratio, etc.) are computed at analysis time by scripts/plot_throughput.py.
  • Fixed packet size per scenario. Each scenario uses a single packet_size_bytes. This ensures every message in a run is the same size, making aggregate measurements unambiguous.
  • Minimal measurement overhead. The hot onDataFrame callback captures the arrival timestamp first, then appends to an in-memory vector under a brief mutex. File I/O happens only at finalization after all data is collected.

Test Bounds

All bounds are defined in common.h. A scenario is any combination of (payload size, publish rate) that passes all three constraints below.

Hard Limits

Parameter Min Max
Packet size 1 KiB 256 MiB
Publish rate 1 Hz 50k Hz

Data-Rate Budget

Every scenario must satisfy:

packet_size_bytes * desired_rate_hz <= 10 Gbps (1.25 GB/s)

This naturally allows small messages at very high rates and large messages at low rates while preventing any single scenario from attempting an unreasonable throughput that would destabilize the connection.

Sweep Grid

By default, the producer iterates over 7 payload sizes and 9 publish rates, skipping any combination that exceeds the data-rate budget:

Payload sizes: 1 KiB, 4 KiB, 16 KiB, 64 KiB, 128 KiB, 256 KiB, 512 KiB

Publish rates: 1, 5, 10, 25, 50, 100, 200, 500, 1k Hz

You can override either axis with comma-separated producer flags:

--sizes_kb 1,4,16,64
--freq_hz 10,50,100,500

--sizes_kb values are interpreted as KiB. --freq_hz values are interpreted as Hz. The producer runs every valid size/rate combination from the selected grid and skips combinations over the data-rate budget.

Single-scenario mode (--rate-hz, --packet-size, --num-msgs) bypasses the sweep grid and only enforces the hard limits and data-rate budget, allowing any valid combination to be tested explicitly.

CSV Output

The consumer writes raw measurement data only. All derived metrics are computed at analysis time by scripts/plot_throughput.py.

throughput_summary.csv

One row per scenario. Contains only raw counts, byte totals, and microsecond timestamps:

Column Description
run_id Unique scenario identifier
scenario_name Human-readable scenario label
desired_rate_hz Requested publish rate
packet_size_bytes Fixed packet size for this scenario
messages_requested Number of messages the producer was told to send
messages_attempted Number of messages the producer tried to send
messages_enqueued Number of messages successfully enqueued
messages_enqueue_failed Number of enqueue failures
messages_received Unique messages received by consumer
messages_missed messages_requested - messages_received
duplicate_messages Number of duplicate frames received
attempted_bytes Total bytes the producer attempted to send
enqueued_bytes Total bytes successfully enqueued
received_bytes Total bytes received by consumer
first_send_time_us Timestamp of first send (microseconds since epoch)
last_send_time_us Timestamp of last send
first_arrival_time_us Timestamp of first arrival at consumer
last_arrival_time_us Timestamp of last arrival at consumer

throughput_messages.csv

One row per received frame. Raw observation data only:

Column Description
run_id Scenario identifier
sequence Message sequence number
payload_bytes Actual payload size received
send_time_us Producer send timestamp (microseconds since epoch)
arrival_time_us Consumer arrival timestamp (microseconds since epoch)
is_duplicate 1 if this sequence was already seen, 0 otherwise

Prerequisites

  • CMake 3.20+
  • C++17 compiler
  • The LiveKit C++ SDK, built and installed (see below)

Building

All commands below assume you are in this directory (data_track_throughput/).

1. Build and install the SDK

From the SDK repository root:

./build.sh          # builds the SDK (debug by default)
cmake --install build-debug --prefix local-install

2. Configure this experiment

cmake -S . -B build \
  -DCMAKE_PREFIX_PATH="$(cd ../../local-install && pwd)"

Adjust the CMAKE_PREFIX_PATH to wherever the SDK was installed. The path above assumes this directory lives two levels below the repository root; it works regardless of the parent directory's name.

3. Build

cmake --build build

The executables and required shared libraries are placed in build/.

Build Targets

  • DataTrackThroughputConsumer
  • DataTrackThroughputProducer

Running

Generate Tokens

# producer
lk token create \
  --api-key devkey \
  --api-secret secret \
  -i producer \
  --join \
  --valid-for 99999h \
  --room robo_room \
  --grant '{"canPublish":true,"canSubscribe":true,"canPublishData":true}'

# consumer
lk token create \
  --api-key devkey \
  --api-secret secret \
  -i consumer \
  --join \
  --valid-for 99999h \
  --room robo_room \
  --grant '{"canPublish":true,"canSubscribe":true,"canPublishData":true}'

Start the local server:

LIVEKIT_CONFIG="enable_data_tracks: true" livekit-server --dev

Start the consumer first:

./build/DataTrackThroughputConsumer <ws-url> <consumer-token>

Then start the producer:

./build/DataTrackThroughputProducer <ws-url> <producer-token> --consumer consumer

If you omit --consumer, the producer expects exactly one remote participant to already be in the room.

Custom Sweep

To run a smaller or denser grid, pass comma-separated lists to the producer:

./build/DataTrackThroughputProducer \
  <ws-url> <producer-token> \
  --consumer consumer \
  --sizes_kb 1,16,256,512 \
  --freq_hz 25,100,500,1000 \
  --messages-per-scenario 50

Single Scenario

Instead of the full sweep, you can run one scenario:

./build/DataTrackThroughputProducer \
  <ws-url> <producer-token> \
  --consumer <consumer-identity> \
  --rate-hz 50 \
  --packet-size 1mb \
  --num-msgs 25

Plotting

Generate plots from a benchmark output directory:

python3 scripts/plot_throughput.py data_track_throughput_results

By default the script writes PNGs into data_track_throughput_results/plots/. Pass --output-dir <path> to override the output location.

All derived metrics (throughput, latency percentiles, delivery ratio, receive rate, interarrival times) are computed from the raw CSV timestamps and counts at plot time.

Generated Plots

From throughput_summary.csv + throughput_messages.csv:

File Description
expected_vs_actual_throughput.png Scatter plot comparing expected vs actual receive throughput (Mbps). Points are colored by desired publish rate and sized by payload. An ideal y=x reference line is overlaid.
dropped_messages_vs_expected_throughput.png Scatter plot of missed/dropped message count vs expected throughput, colored by payload size (log scale).
actual_throughput_heatmap.png Heatmap of actual receive throughput (Mbps) with payload size on the y-axis and desired rate on the x-axis.
delivery_ratio_heatmap.png Heatmap of delivery ratio (received / requested) over the same payload-size x rate grid.
p50_latency_heatmap.png Heatmap of median (P50) send-to-receive latency (ms) over the same grid.
p95_latency_heatmap.png Heatmap of P95 send-to-receive latency (ms) over the same grid.
message_latency_histogram.png Histogram of per-message latency (ms) across all received frames.
message_interarrival_series.png Time-series line plot of inter-arrival gaps (ms) for every received message, ordered by run then arrival time.