Thanks to visit codestin.com
Credit goes to github.com

Skip to content

TeleViaBox/TeleViaBox

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

50 Commits
 
 
 
 
 
 

Repository files navigation

🎉🎉 See you in CVPR'26 (Denver, USA) and AISTATS'26 (Tangier, Morocco)! 🎉🎉

CVPR 2026 Accepted: https://arxiv.org/pdf/2602.23295

DevOps CONTRIBUTIONS

1. RAG (Retrieval-Augmented Generation) for Long-Context Queries (32k paragraphs in query space) + LLM (a. Falcon LLM local serving, and b. Openai API) + Semantic Search in Vector DB (Chroma) + End-to-end Deployed on GCP cloud platform

2. Neo4j + FastAPI + Prometheus + Grafana (Observability, Maintainability) + Load Test End-to-end Deployed on AWS cloud platform + Load Testing on AWS

3. Website: Full-stack LLM conversational semantic search & chat using Vector Database with RAG

Cloud Deploy: Delivered a semantic search, chat app end-to-end (React UI, Python APIs), deployed on GCP with CI/CD. Backend Maintainability, Scalability: Built an ANN Vector Database (Chroma) over 11 novels with sub-second retrieval; evaluated quality & performance (R@5/10 0.88/0.94, MRR@10 0.82; latency p50/p95 95/220 ms; QPS 14).

4. Spotify Mega-scale system DevOps: Spotify/Pedalboard: Fixed #411.

[Method] Added 20-line PortAudio guard that raises RunTime Error when an output device vanishes,
[Issue] eliminating an infinite-block bug that could [Impact] stall Spotify’s Safe-and-Sound pipelines (used to vet 7 M+ podcasts for 696 M MAU)

5. Meta GPU DevOps: Meta-RecSys/Generative-Recommenders: Fixed #308.

[Method] Added a gin-configurable HSTU attention backend dispatcher (auto: C++ on Hopper, else logs safe fallback). [Issue] Addresses the H100 efficiency / missing integration called out. [Impact] Enables Hopper installs to use the optimized attention path while preserving public pipelines; HSTU is reported deployed across multiple Meta surfaces serving billions of users, underpins Meta’s large-scale GR stack (incl. recent context-parallel training results), and is referenced by NVIDIA’s RecSys examples.


Selected Projects

preview


Backend Stack

External Interface (B2B / B2C Users)

  • GraphQL API (HTTP): Q&A requests, text uploads, job status queries, and result fetching from frontend/web.
  • GraphQL Subscriptions: Real-time updates (job progress, streaming tokens).

Internal Communication (between backend services)

  • gRPC: Fast, type-safe calls across embedding, retrieval, generation, classification services.
  • Event Bus (SNS→SQS or Kafka/Redpanda): Async, decoupled workflows for indexing, streaming updates, long-job orchestration.

Background Task Processing

  • Celery (recommended) or RQ: Long-running tasks like indexing, embedding generation, offline analysis, batch reporting.

Persistent Storage & Dependencies

  • Chroma / Vector DB: Embedding storage for semantic retrieval (already in use).
  • Redis: Task-queue broker (Celery/RQ), cache, and pub/sub for GraphQL subscriptions.
  • (Optional) PostgreSQL: Tenants, billing, quotas, audit logs, job metadata.
  • Object Storage (S3 or equivalent): Original files, intermediate artifacts, exported results.

Recommendation System: A/B Test System Design

Business Scope (Feb 2024)

1) Metrics: Primary, Secondary, Guardrails

  • Primary: single main outcome; attributable, sensitive, stable.
    Examples: daily engagement per user, average time spent, 7-day retention.
  • Secondary: aid understanding & side effects.
    Examples: share rate, cold-start content views, creator diversity.
  • Guardrails: protect UX, performance, and business health.
    Examples: latency (p95/p99), crash rate, content quality, policy violations, ad revenue.
  • Rule: Primary decides go/no-go; guardrails prevent “winning dirty.”

2) Power & Sample Size

  • Use historical baselines + calculator/internal tooling.
  • Reduce sample size via:
    • Trigger-based sampling: include only exposed/affected users.
    • Variance reduction (CUPED): pre-experiment behavior covariates.
    • Clustering adjustment: feeds aren’t IID → real sample needs may be higher.

4) Feed-Specific Stats Considerations

  • Design: user-level randomization; trigger-based (only users who open feed); switchback tests for infra (alternate by time/geography).
  • Analysis: aggregate at user-day; cluster-robust SEs; pre-bucket users; IPW/position correction at impression level.
  • Leakage control: avoid splitting viral content/creators; for highly connected systems consider ghost experiments or community-level randomization.
Implementation Details (Jan 2024)

1) Trustworthiness: SRM, Stopping Rules, Multiplicity

  • SRM: chi-square check for expected control/treatment ratios; investigate routing/triggering if mismatched.
  • Stopping rules: pre-register window & analysis; avoid peeking without correction; use alpha-spending or Bayesian approaches.
  • Multiple comparisons: one primary metric; FDR (Benjamini–Hochberg) for secondaries/variants; bandits for exploration, confirm final rollout with traditional testing.

2) Practical Execution Plan

  1. Define goal & thresholds
    – e.g., “+1% lift in daily engagement,” MDE=+1%, α=0.05, power=0.8
  2. Estimate sample size
    – historical data → triggered users per group
  3. Randomization
    – user-level bucketing; include only triggered users; prevent creator/content leakage
  4. Variance reduction
    – use previous 7-day behavior as covariates
  5. Metrics & significance
    – aggregate at user-day; cluster-robust t-tests or regression
  6. Health checks
    – SRM; latency/crash; guardrails (harmful content, complaints)
  7. Interpretation & rollout
    – primary passed + guardrails stable; analyze heterogeneity; roll out / iterate / rollback; optional follow-up for long-term effects (e.g., 28-day retention)

Open Source Contributions

Recommendation Systems

  • Spotify/Pedalboard — Fixed #411
    Method: 20-line PortAudio guard that raises RuntimeError when an output device vanishes.
    Issue: Eliminates an infinite-block bug that could stall Spotify’s Safe-and-Sound pipelines (used to vet 7M+ podcasts for 696M MAU).

  • Meta-RecSys/Generative-Recommenders — Fixed #308
    Method: Gin-configurable HSTU attention backend dispatcher (auto→C++ on Hopper; safe fallback otherwise) without changing defaults.
    Issue: Addresses H100 efficiency / missing integration.
    Impact: Hopper gets the optimized attention path without public-pipeline changes; HSTU underpins large-scale GR incl. context-parallel and appears in NVIDIA RecSys examples.


About Me

Skill sets

  • Languages: C, C++, MATLAB, Python, Java
  • Software engineering: OOP, large-scale system scalability design
  • Hardware-adjacent: signal & image processing; encoding/decoding computation
  • Tools: VMware, Postman

Old Projects

  1. Light-weight search engine on GCP (Compute Engine) with LLM chat via RAG on Project Gutenberg novels.
    Repo
  2. Sepolia smart contract for real-estate trading + full-stack website (Flask) deployed on GCP.
    Repo

Current Projects

  1. https://github.com/TeleViaBox/vidtrans/
  2. https://github.com/TeleViaBox/vqaeffic/blob/main/README.md
  3. “not yet for public”: https://github.com/TeleViaBox/leos-vehicle-network

Not-yet Finished

  1. ML toolbox for information-space search (vector DB, loss functions)
  2. Dashboard website for easy data import & visualization
  3. Travel-spot search & visualization
  4. PageRank & reverse-connected network analysis
  5. Tabular machine & system-stability controller hyper-parameter optimization (heuristics)
  6. Course: Operating Systems — pintos-prac
  7. Course: Computer Security experiments
  8. Course: Computer Networks experiments
  9. Course: Algorithm Analysis & Design
  10. Hard problems solving — repo
  11. My understanding in computer science — repo
  12. Java: multimedia 2D Windows desktop application
  13. Data-science EDA with school course list
  14. IQA (Image) & VQA (Video) study — repo
  15. Optimization theory study — repo
  16. My understanding in Computer Network

Not-yet Started

  1. Pac-Man design
  2. WhatsApp design

Interview Notes

  • ML aspects
  • Search engine aspects
  • Algorithm aspects
  • System & network aspects: distributed systems, networking, operating systems

My best practice in DevOps: https://gist.github.com/TeleViaBox/f702ec4454e783216f072d7dd43615eb

I’m building the next generation of LLM search and practical AI systems you can actually run.

What I work on

  • 🧠 LLM Search & RAG — retrieval pipelines, vector DBs, hybrid search, agentic workflows
  • 🎥 Video IQA/VQA & streaming quality — VMAF, encoding & evaluation tooling
  • 📈 Recommendation systems — experiment design, metrics, guardrails, and trustworthy analysis
  • 🛠 Distributed backends — gRPC services, event buses, background workers
  • 🔗 Blockchain + full-stack demos — Sepolia smart contracts, GCP deployments

Reading List


Appendix: How I Prepare Repos

About

No description, website, or topics provided.

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors