ATC25 Colocating ML Inference and Training with Fast GPU Memory Handover

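Today yf shared an ATC25 paper from IPADS: Colocating ML Inference and Training with Fast GPU Memory Handover. Brief comment: another large-scale engineering effort typical of IPADS, built on TVM + vLLM + NCCL + PyTorch. Everyone asked a lot of questions during the group meeting. https://ipads.se.sjtu.edu.cn/_media/publications/si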

Paper Reading
Haibin
2026-01-15
52 Views
0 Comments

STOC81 I/O Complexity: The Red-Blue Pebble Game

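STOC81 I/O Complexity: The Red-Blue Pebble Game. This is a theoretical computer science paper, but it poses a very interesting question: just as we have time complexity, can we define an I/O complexity that measures the minimum number of I/O operations a program must perform? Paper link: https://www.eecs.harvard.edu/~htk/publication/1981-stoc-hong-kung.pdf Com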

Paper Reading
Haibin
2026-01-09
69 Views
0 Comments

In-depth analysis: RetroInfer: A Vector-Storage Approach for Scalable Long-Context LLM Inference

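I used to read papers with an LLM, but later found that in the same 20 minutes I actually learn less that way than by reading carefully myself and asking about the key questions. Can KVCache make use of RAG techniques? The idea of this paper is: can we "build KVCache as a Vector Storage System"? With long contexts, the KVCache often exceeds GPU memory, so the excess has to be stored in CPU memory. That is slow (CPU-GPU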

Paper Reading
Haibin
2026-01-08
81 Views
0 Comments

Task-based Parallelism Models and Their Techniques: Overview

So far there are many task programming models. Charm++ Website: https://charmplusplus.org/applications/ Github: https://github.com/charmplusplus/charm Tutorial: https://charm.readthedocs.io/en/latest/

High Performance Computing
Haibin
2026-01-07
49 Views
0 Comments

Distributed and Cloud Computing Assignment 4

Feedback Feedback to Learner 12/30/25 3:55 PM 82+5=87 (extra: 0) > Summary: As we demonstrated in the lab, you should pre-assign labels and taints to cluster nodes using Kind config YAML. Other parts

Distributed Systems
Haibin
2026-01-07
54 Views
0 Comments

America Against America

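America Against America. I first thought systematically about America in high school, when I read "The Worries Deep in History" (《历史深处的忧虑》) by the couple who write as Lin Da. Later I read Tocqueville's "Democracy in America", and then, today, Huning Wang's "America Against America". These authors, from different eras, nationalities, and standpoints, observe American politics, economy, and culture from many angles in their books. As for me, I went from watching documentaries to actually setting foot on this unfamiliar land and living there for half a year. Rethinking everything the books describe, I have gained many new experiences. Likewise after half a year of travel, Wang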

Books Reading
Haibin
2026-01-04
179 Views
0 Comments

Learn Compilers in 6 hours

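I spent half the semester on applications and the other half busy with a paper, so I barely touched this course. The exam overall is not hard, though. Efficient "exam prep": the exam was at 16:30 on Monday afternoon; I started studying at 3 a.m. on Monday and finished at 9 a.m. I slept 5 hours, got up at 2 p.m. for breakfast and a shower, then took the exam: 69/100. I'm a senior anyway, so passing is enough and the score is just for fun. This lecturer explains it best, the ancient Greek god in charge of compilers: problems only, all substance. 【【武汉大学】编译原理混子速成——面向期末试卷复习：全集】 http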

Compilers
Haibin
2025-12-30
95 Views
1 Comment

Distributed Systems and Cloud Computing: Review 1

This is the self-review pack of Distributed Systems and Cloud Computing. We have lesson 1-5. Lesson 1 Presentation – Effective communication of information rather than of data – Code and number conver

Distributed Systems
Haibin
2025-12-30
170 Views
0 Comments

DnCC3: Introduction to Spark

In this assignment, we need to use Spark to analyze the Parking dataset. Preparing: install pyspark and Java. pip install pyspark sudo apt-get update sudo apt-get install openjdk-17-jdk export JAVA_HOME=

Distributed Systems
Haibin
2025-12-30
94 Views
0 Comments

DnCC Assignment 1: Parallel Matrix Multiplication

https://github.com/HaibinLai/Distributed-and-Cloud-Computing.git 【分布与云计算 - DnCC 复习】 https://www.bilibili.com/video/BV1eovaBTEW9/?share_source=copy_web&vd_source=72eac555730ba7e7a64f9fa1d7f2b2d4 Setup

Distributed Systems
Haibin
2025-12-30
103 Views
0 Comments

A Simple Merch Store Backend: Distributed and Cloud Computing Assignment 2

Scores 95+10=105 (extra: 5) Summary: The impl is nice in general, and the report is awesome! Yes, this is an assignment where you should follow certain instructions and submit certain stuff, but just

Distributed Systems
Haibin
2025-12-30
199 Views
0 Comments

You and your research | Richard W. Hamming

You and Your Research https://gwern.net/doc/science/1986-hamming Great work is something else than mere brains. Brains are measured in various ways. In mathematics, theoretical physics, astrophysics, typically brain

Paper Reading
Haibin
2025-12-30
64 Views
0 Comments

MoonshotAI: Sharing for VibeCoding Examples and Debug Techniques

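Vibe Coding Meetup, Beijing | VibeCoding Examples and Debug Techniques https://www.douyin.com/video/7543627062267923747 This video records a talk on vibe coding by Moonshot AI (Kimi). Software engineering: no silver bullet -> AI? AI can now run for tens of minutes, processing large amounts of data and code. Windsurf acquisition; Claude Code | Cursor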

Academic
Haibin
2025-12-04
132 Views
0 Comments

The Old Man and the Sea: Toil Without Reward

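The Old Man and the Sea is set in Cuba in the last century, a place far from me and my world. A fisherman hooks a great fish, then fights sharks in a storm; in the end they strip the flesh clean and he brings back only the skeleton. In high school I could not understand it. A story of giving everything and returning empty-handed did not sound meaningful. I could not understand what the old man was thinking, or why he would take on such a pointless fight. He seemed just like Don Quixote: stubborn, with a touch of foolishness and sorrow. The old man and the sea. Someone said these are the two most unequal things he had ever seen put in one title. What ability does the old man have to contend with the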

Books Reading
Haibin
2025-11-28
259 Views
0 Comments

How to Write a 2000-Line Course Project with AI

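My distributed systems course recently had an assignment: write the backend of a store. Customers access and purchase products through a web API, and the backend product database provides CRUD plus a message subscription service for product purchases. Broken down, it needs an OpenAPI Service (the backend API), a Database Service, and a logging Service, with all 3 microservices running in Docker, about 2000-3000 lines of Python. Working together with GPT and Deepseek, this assignment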

Computer Science
Haibin
2025-11-16
413 Views
0 Comments

I Fix PMUs in the CPU: Can We Trust Profiling Results?

Can We Trust Profiling Results? Understanding and Fixing the Inaccuracy in Modern Profilers https://par.nsf.gov/servlets/purl/10122098 After last reading the blog post "Where Do Interrupts Happen?" (my Chinese walkthrough: https://www.haibi

Paper Reading
Haibin
2025-11-11
289 Views
0 Comments

AI Compiler Group Meeting

A 109-page slide deck, from TVM to Mirage, introducing AI Compiler 101. The talk took 90 minutes. Slides and videos: https://drive.google.com/drive/folders/1eKcHZKMpix31EcioiNCf16AzLIHkvGyy?usp=sharing

Paper Reading
Haibin
2025-11-11
217 Views
0 Comments

Today's Students Lack the Ability to Cut into Large Engineering Projects

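The classmates I know and my own powers of observation are both fairly limited. But over these days of doing research, talking with friends about research, and asking people where they feel lost or stuck, I keep noticing this: today's students lack the ability to cut into large engineering projects. Some GitHub project won't run. More than 10 commands in the build/install stage and they are helpless. They can't find the right command in documentation longer than 30 pages. They can't even ask GPT the right question ...... Students used to complain that senior lab members were unwilling to mentor them. But someone with no foundation really is too hard to mentor; it simply drags down the pace, and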

Books Reading
Haibin
2025-11-07
295 Views
0 Comments

A Classic Post Explained: How Does AVX Lower Your CPU Frequency?

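GB! This post is once again a walkthrough of a technical blog post by the legendary author Travis Downs https://x.com/trav_downs. Article link: Gathering Intel on Intel AVX-512 Transitions https://travisdowns.github.io/blog/2020/01/17/avxfreq1.html This post is an analysis and interpretation built on it; if any content infringes copyright, please contact me, and I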

Article Translation
Haibin
2025-11-06
618 Views
0 Comments