R3hankhan123/README.md

Hey 👋, I'm Rehan Khan

🔧 I build systems that run models, not demo notebooks that break outside Jupyter


🧠 What I Actually Do

  • πŸš€ Make LLMs run on spyre accelerator and CPUs (yes CPUs)
  • πŸ”© Bend vLLM into places it wasn't designed for
  • πŸ—οΈ Fix build systems across architectures
  • πŸ“¦ Make multi-arch containers actually behave

🔭 Currently Building

  • ⚡ CPU-only LLM inference pipelines
  • 🏛️ s390x (CPU) and Spyre support for modern ML stacks
  • ☸️ Infra that works beyond a single machine

⚙️ Current Obsessions

🔥 squeezing every drop of performance
⚠️ making PyTorch do questionable things
☸️ running clean infra on Kubernetes

🛠️ Languages & Tools


📊 GitHub Stats

r3hankhan123


🀝 Connect With Me

linkedin


Pinned

  1. containerd (Public)

    Forked from containerd/containerd

    An open and reliable container runtime

    Go

  2. torch-spyre (Public)

    Forked from torch-spyre/torch-spyre

    C++

  3. pytorch/pytorch (Public)

    Tensors and Dynamic neural networks in Python with strong GPU acceleration

    Python · 100k stars · 27.9k forks

  4. vllm-project/vllm (Public)

    A high-throughput and memory-efficient inference and serving engine for LLMs

    Python · 80.8k stars · 17.1k forks