InferencePool: Signal Backend Protocol

Backend applications can signal their desired protocol through the [appProtocol](https://kubernetes.io/docs/concepts/services-networking/service/#application-protocol) field of a Service port. Because inference backends, e.g. model servers, are not expected to be fronted by a Service, InferencePool should expose a similar field that Gateway implementations can use to select the appropriate protocol when routing a request.

- vLLM: Supports HTTP/1.1 ([xref](https://github.com/vllm-project/vllm/issues/17695)).
- Triton: Supports HTTP/1.1 (REST) and gRPC inference protocols ([xref](https://docs.nvidia.com/deeplearning/triton-inference-server/user-guide/docs/protocol/README.html)).
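
To make the request concrete, here is a minimal Go sketch of what such a field could look like and how a Gateway implementation might consume it. Everything in it is an assumption for illustration: the `AppProtocol` type, the trimmed `InferencePoolSpec` struct, the `BackendProtocol` helper, the fallback to HTTP/1.1 when the field is unset, and the port numbers are all hypothetical and not part of the current API. The `kubernetes.io/h2c` value is borrowed from the values Kubernetes already defines for Service `appProtocol`.

```go
package main

import "fmt"

// AppProtocol is a hypothetical string type mirroring Service appProtocol.
type AppProtocol string

const (
	// AppProtocolHTTP is an assumed value meaning plain HTTP/1.1.
	AppProtocolHTTP AppProtocol = "http"
	// AppProtocolH2C is HTTP/2 over cleartext, which a gRPC backend would need.
	AppProtocolH2C AppProtocol = "kubernetes.io/h2c"
)

// InferencePoolSpec is a trimmed, hypothetical stand-in for the real spec.
// Only the fields needed for this sketch are shown.
type InferencePoolSpec struct {
	// TargetPortNumber is the port the model servers listen on.
	TargetPortNumber int32
	// AppProtocol is the proposed optional field (name is an assumption).
	AppProtocol *AppProtocol
}

// BackendProtocol returns the protocol a Gateway implementation would use
// for the pool. Falling back to HTTP/1.1 when the field is unset or empty
// is an assumption of this sketch, not a decided behavior.
func BackendProtocol(spec InferencePoolSpec) AppProtocol {
	if spec.AppProtocol == nil || *spec.AppProtocol == "" {
		return AppProtocolHTTP
	}
	return *spec.AppProtocol
}

func main() {
	h2c := AppProtocolH2C

	// A pool of HTTP/1.1-only servers (e.g. vLLM) leaves the field unset.
	httpPool := InferencePoolSpec{TargetPortNumber: 8000}
	// A pool serving gRPC (e.g. Triton) opts in explicitly.
	grpcPool := InferencePoolSpec{TargetPortNumber: 8001, AppProtocol: &h2c}

	fmt.Println(BackendProtocol(httpPool))
	fmt.Println(BackendProtocol(grpcPool))
}
```

An optional pointer field keeps existing InferencePool manifests valid, since pools that omit it would keep today's routing behavior under the assumed default.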

InferencePool: Signal Backend Protocol #1273

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

InferencePool: Signal Backend Protocol #1273

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions