Document GKE TPU Performance #133
Another important factor is DraNet's ability to pass interface configuration options that tune the interfaces for maximum performance, for example [Big TCP](https://lwn.net/Articles/884104/).
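As a rough illustration of the kind of tuning meant here, Big TCP is governed by the per-device GSO/GRO size limits, which can also be raised by hand with `ip link`. The interface name `eth1` and the size `185000` are assumptions made for this sketch, not values taken from this PR, and whether a given driver accepts them is not something the sketch checks. It only prints the command unless `DRY_RUN=0` is set.

```shell
#!/bin/sh
# Assumed values for illustration only; adjust to the interface DraNet manages.
IFACE="${IFACE:-eth1}"
BIG_TCP_SIZE="${BIG_TCP_SIZE:-185000}"
# DRY_RUN=1 (default) prints the command; DRY_RUN=0 runs it (needs root).
DRY_RUN="${DRY_RUN:-1}"

# Build the ip-link command that raises the GSO and GRO size limits.
# $1 = interface name, $2 = maximum size in bytes
big_tcp_cmd() {
    echo "ip link set dev $1 gso_max_size $2 gro_max_size $2"
}

cmd=$(big_tcp_cmd "$IFACE" "$BIG_TCP_SIZE")
if [ "$DRY_RUN" = "1" ]; then
    echo "+ $cmd"
else
    # Word splitting is intentional: cmd holds a simple, space-separated command.
    $cmd
fi
```

Running it with `DRY_RUN=1` is a cheap way to review the exact command before applying it on a node.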
In addition, if you have gVNIC enabled, you can use some private ethtool flags that improve TCP performance, such as [enable-max-rx-buffer-size](enable-max-rx-buffer-size).
Should we explain what these flags do and why we are setting these values? Especially the private flags.
Yeah, I was trying to find some public docs to reference, but only found this: https://github.com/GoogleCloudPlatform/compute-virtual-ethernet-linux/blob/d4e3772aea0fec953f33f1776eea33f8e9d9e2ee/build/gve_ethtool.c#L87
or https://github.com/search?q=enable-max-rx-buffer-size&type=code
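In the absence of public docs, one way to see what the driver exposes is to ask ethtool directly. The sketch below assumes a gVNIC interface named `eth1` (a hypothetical name) and uses the standard `--show-priv-flags` and `--set-priv-flags` options; the flag name is the one quoted in the diff. It prints the commands by default and only runs them when `DRY_RUN=0`.

```shell
#!/bin/sh
# Hypothetical interface name; replace with the gVNIC interface in use.
IFACE="${IFACE:-eth1}"
# DRY_RUN=1 (default) prints each command; DRY_RUN=0 runs it (needs root).
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# List the private flags the driver exposes and their current state.
run ethtool --show-priv-flags "$IFACE"

# Turn on the flag mentioned in the doc; it only exists on drivers that define it.
run ethtool --set-priv-flags "$IFACE" enable-max-rx-buffer-size on
```

Listing the flags first shows whether the flag exists on the node's driver version before trying to set it.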
Change-Id: Ic2cbf3ce0a4f40932268050db7f1d3ff3053429f
Change-Id: I535e6deebb36a861c32ded91f33549815e7f0275
gcloud compute --project=${PROJECT?} \
  networks subnets create \
  tpu-net-2-sub \
  --network=tpu-net-1 \
@aojea this should be tpu-net-2
It would be helpful to have a before-and-after picture: with hostNetwork=true vs. with DraNet. I suspect DraNet perf is actually much better.
It turns out that some processes, such as Cilium, pin their programs to the BPF filesystem, so we need to delete those pins to be able to remove the BPF programs; otherwise we cannot detach them because they are still referenced.
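A minimal sketch of that cleanup, done by hand, might look like the following. It assumes the BPF filesystem is mounted at `/sys/fs/bpf` and that the interface is called `eth1` (both assumptions), and it is not the code from this change. By default it only prints what it would do, because removing every pin on a node would also break whatever else relies on them; in practice the list should be narrowed to the pins that belong to the programs on the interface.

```shell
#!/bin/sh
# Assumed mount point and interface name; adjust for the node.
BPFFS="${BPFFS:-/sys/fs/bpf}"
IFACE="${IFACE:-eth1}"
# DRY_RUN=1 (default) prints each command; DRY_RUN=0 runs it (needs root).
DRY_RUN="${DRY_RUN:-1}"

# Print the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Print every pinned object (any non-directory entry) under a bpffs path.
list_pins() {
    [ -d "$1" ] || return 0
    find "$1" ! -type d
}

# Show which BPF programs are currently attached to the interface.
run bpftool net show dev "$IFACE"

# Unlinking a pin drops the reference that keeps the program alive.
list_pins "$BPFFS" | while IFS= read -r pin; do
    run rm -- "$pin"
done
```

Once the pins are gone and the programs are detached from the interface, nothing references them and the kernel can free them.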
Add documentation on how to maximize TCP throughput on TPU v6 machines by using two virtual interfaces that map to the two physical interfaces of the physical VM. @samos123, you'll be interested in this.