
@goshawk-3
Contributor

@goshawk-3 goshawk-3 commented Dec 11, 2025

This adds a non-blocking, distributed locking mechanism that coordinates dependence-chain processing across multiple tfhe-worker replicas.

A worker can acquire a lock on the next available dependence-chain entry. Entries are picked in order of last_updated_at (a FIFO, queue-like approach).

A worker may acquire a DCID when either of the following holds:

  • dependency_count is 0 and the DCID is not locked, or

  • dependency_count is 0 and the DCID is locked but the lock has expired.

Additional behavior:

  • Ownership expires after a timeout, enabling work-stealing by other workers for resilience.

  • A GC procedure runs regularly to clean up processed DCIDs.
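The acquisition rule above can be illustrated with a small in-memory sketch. This is not the code from this PR: it is a single-process Python model in which only dependency_count and last_updated_at come from the description, while ChainEntry, owner, lock_expires_at, is_acquirable, try_acquire and ttl_sec are hypothetical names chosen for illustration. A real multi-replica deployment would need the check and the ownership update to happen atomically in shared storage.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ChainEntry:
    # Only dependency_count and last_updated_at are named in the PR text;
    # the remaining fields are hypothetical.
    dcid: str
    dependency_count: int
    last_updated_at: float           # seconds; smaller means older
    owner: Optional[str] = None      # worker currently holding the lock, if any
    lock_expires_at: float = 0.0     # only meaningful while owner is set


def is_acquirable(entry: ChainEntry, now: float) -> bool:
    """No pending dependencies, and either unlocked or the lock has expired."""
    if entry.dependency_count != 0:
        return False
    return entry.owner is None or entry.lock_expires_at <= now


def try_acquire(entries: List[ChainEntry], worker_id: str,
                now: float, ttl_sec: float) -> Optional[ChainEntry]:
    """Non-blocking: take the oldest acquirable entry, or return None at once."""
    candidates = [e for e in entries if is_acquirable(e, now)]
    if not candidates:
        return None
    entry = min(candidates, key=lambda e: e.last_updated_at)  # FIFO-like order
    entry.owner = worker_id
    entry.lock_expires_at = now + ttl_sec
    return entry


entries = [
    ChainEntry("a", dependency_count=1, last_updated_at=1.0),
    ChainEntry("b", dependency_count=0, last_updated_at=2.0),
    ChainEntry("c", dependency_count=0, last_updated_at=3.0),
]
first = try_acquire(entries, "worker-1", now=10.0, ttl_sec=30.0)
second = try_acquire(entries, "worker-2", now=11.0, ttl_sec=30.0)
third = try_acquire(entries, "worker-2", now=12.0, ttl_sec=30.0)
stolen = try_acquire(entries, "worker-2", now=41.0, ttl_sec=30.0)
```

In this run, entry "a" is never eligible because it still has a pending dependency. The third call returns None without waiting because both eligible entries are locked, and the last call shows work-stealing: the lock on "b" has expired, so another worker takes it over.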

@cla-bot cla-bot bot added the cla-signed label Dec 11, 2025
@mergify

mergify bot commented Dec 11, 2025

🧪 CI Insights

Here's what we observed from your CI run for 1df24ba.

🟢 All jobs passed!

But CI Insights is watching 👀

@rudy-6-4
Contributor

The new parameters are missing from values.yaml.

@goshawk-3
Contributor Author

See also: #1506 (comment)

@goshawk-3 goshawk-3 changed the title from "Feature/tfhe worker/scalability" to "feat(coprocessor): add a non-blocking, distributed locking mechanism across multiple tfhe-workers" Dec 16, 2025
@goshawk-3 goshawk-3 changed the title from "feat(coprocessor): add a non-blocking, distributed locking mechanism across multiple tfhe-workers" to "feat(coprocessor): add a non-blocking, distributed locking mechanism in tfhe-worker" Dec 16, 2025
@goshawk-3 goshawk-3 force-pushed the feature/tfhe-worker/scalability branch from 687e931 to e63b299 Compare December 16, 2025 10:36
@antoniupop
Collaborator

New CLI params

  • --worker-id
  • --dcid-ttl-sec
  • --dcid-timeslice-sec
  • --disable-dcid-locking

Could you please update the charts with these (or any other new params added)? I think we've mostly converged on the architecture, so it would be good to start planning for deployment.
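For readers who want a quick picture of the flag surface listed above, here is a hedged sketch of how the four flags could be parsed. It is a Python argparse model, not the worker's actual CLI code: only the flag names come from the comment, while the program name, types and default values are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Flag names are taken from the review comment.
    # Types and defaults below are assumptions, not values from the PR.
    parser = argparse.ArgumentParser(prog="tfhe-worker-sketch")
    parser.add_argument("--worker-id", type=str, default=None)
    parser.add_argument("--dcid-ttl-sec", type=int, default=30)
    parser.add_argument("--dcid-timeslice-sec", type=int, default=90)
    parser.add_argument("--disable-dcid-locking", action="store_true")
    return parser


# argparse maps "--worker-id" to the attribute "worker_id", and so on.
args = build_parser().parse_args(
    ["--worker-id", "worker-1", "--dcid-ttl-sec", "60"]
)
```

A chart update would then need one values.yaml entry per flag, with each replica receiving a distinct worker ID if ownership is tracked per worker.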

@goshawk-3 goshawk-3 force-pushed the feature/tfhe-worker/scalability branch 4 times, most recently from b642f1b to c35b36c Compare December 19, 2025 16:06
…iple workers

It provides a non-blocking, distributed locking mechanism that
coordinates dependence-chain processing across multiple tfhe-workers.

A worker can acquire ownership of the next available dependence-chain entry for processing
ordered by last_updated_at (FIFO queue-like approach).

Ownership expires after a timeout, enabling work-stealing by other workers.

New CLI param --worker_id
- Added LockingReason for logging
- Make expiry configurable
- add --dcid_ttl_sec config
- add otel traces for dcid
@antoniupop antoniupop force-pushed the feature/tfhe-worker/scalability branch from 37097ca to eabc5bc Compare December 21, 2025 08:08
@antoniupop antoniupop force-pushed the feature/tfhe-worker/scalability branch from eabc5bc to bf270bc Compare December 21, 2025 08:11
@antoniupop antoniupop force-pushed the feature/tfhe-worker/scalability branch from bf270bc to e305ddf Compare December 21, 2025 08:32
@antoniupop antoniupop force-pushed the feature/tfhe-worker/scalability branch from 46dcec4 to 4402c9e Compare December 23, 2025 08:08
4 participants