Support multiple targetPorts on an InferencePool #58238
base: master
e795eb0 to d092e37
Implements the EPP protocol. It accepts an X-Endpoint header that allows the integration test to determine which endpoint will be picked by the EPP. Co-Authored-By: Claude <[email protected]>
I planned on updating to v1.1.0 but ran into dependency issues. Now pointing to the commit that loosened the restrictions on the number of targetPorts.
f74458c to f7b544c
156176f to 8335b5b
}

// Fallback if no targetPorts specified
if len(ports) == 0 {
Since inferPool's TargetPorts cannot be empty, is this check necessary?
That's correct. I'll remove the check.
var out []*IstioEndpoint

// For InferencePool services, return ALL endpoints regardless of port
// because they may have different target ports but belong to the same cluster
Maybe I missed some context: since the EPP is responsible for load balancing, why does Istio still need to generate all endpoints with all target ports?
The EPP just returns an endpoint to Envoy, but for Envoy to be able to route to that endpoint, it must exist in the Envoy cluster. That's why we need to combine them here, so that they all end up in the same Envoy cluster.
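The combining step discussed in this thread can be sketched as follows. This is a simplified illustration under stated assumptions, not the code from the PR: `Endpoint` is a stand-in for Istio's `IstioEndpoint`, and `mergeEndpoints` and the per-port map are hypothetical names chosen for the example.

```go
package main

import (
	"fmt"
	"sort"
)

// Endpoint is a simplified stand-in for Istio's IstioEndpoint (assumption:
// only the address and port matter for this illustration).
type Endpoint struct {
	Address string
	Port    uint32
}

// mergeEndpoints is a hypothetical helper. It flattens the endpoints of every
// target port into one slice, so that all of them can be placed in a single
// Envoy cluster and any endpoint the EPP picks is routable.
func mergeEndpoints(byPort map[uint32][]Endpoint) []Endpoint {
	// Sort the ports so the output order is deterministic.
	ports := make([]uint32, 0, len(byPort))
	for p := range byPort {
		ports = append(ports, p)
	}
	sort.Slice(ports, func(i, j int) bool { return ports[i] < ports[j] })

	var out []Endpoint
	for _, p := range ports {
		out = append(out, byPort[p]...)
	}
	return out
}

func main() {
	byPort := map[uint32][]Endpoint{
		8000: {{Address: "10.0.0.1", Port: 8000}, {Address: "10.0.0.2", Port: 8000}},
		8001: {{Address: "10.0.0.2", Port: 8001}},
	}
	for _, ep := range mergeEndpoints(byPort) {
		fmt.Printf("%s:%d\n", ep.Address, ep.Port)
	}
}
```

If the endpoints stayed split per port, an endpoint chosen by the EPP on a different target port would not be present in the cluster Envoy routes through, which is the problem the reply describes.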
/cc
This adds support for multiple targetPorts in an InferencePool by adding all targetPorts to the shadow service and ensuring that only a single cluster is created for the dummy port (54321), which allows the EPP to load-balance across all endpoints.
@dgn: The following test failed, say
Please provide a description of this PR:
This adds support for multiple targetPorts on an InferencePool by merging all endpoints across all targetPorts into a single Envoy cluster, so that the EPP can load-balance across them. The ability to configure multiple targetPorts in an InferencePool was introduced in GIE v1.1.0.

I also used Claude Code to generate an EPP mock that I'm using in the integration test. It accepts an X-Endpoint header that allows the test to predetermine which endpoint should be selected.

It has become quite a large PR, so I split it into multiple smaller commits to make reviews easier.
Fixes #57638