InferencePool: Signal Backend Protocol #1273

@danehans

Description

Backend applications can signal their desired protocol through the appProtocol field of a Service port. Because inference backends such as model servers are not expected to be fronted by a Service, InferencePool should expose a similar field that Gateway implementations can use to select the appropriate protocol when routing a request.

vLLM: Supports HTTP/1.1 (xref).
Triton: Supports HTTP/1.1 (REST) and gRPC inference protocols (xref).
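
To illustrate the request, here is a minimal sketch of how a Gateway implementation might choose the upstream protocol from such a field. Everything in it is an assumption made for illustration: the `app_protocol` field name, the `InferencePoolSpec` model, the `upstream_protocol` helper, and the example ports are hypothetical and not part of the current InferencePool API. The only value borrowed from Kubernetes is `kubernetes.io/h2c`, the appProtocol value for HTTP/2 over cleartext, which a gRPC backend without TLS would need.

```python
from dataclasses import dataclass
from typing import Optional

# Kubernetes-defined appProtocol value for HTTP/2 over cleartext.
H2C = "kubernetes.io/h2c"


@dataclass
class InferencePoolSpec:
    """Hypothetical, trimmed-down model of an InferencePool spec."""

    target_port: int
    # Hypothetical field mirroring Service appProtocol; not in the current API.
    app_protocol: Optional[str] = None


def upstream_protocol(pool: InferencePoolSpec) -> str:
    """Return the protocol a gateway would use toward the pool's endpoints."""
    if pool.app_protocol == H2C:
        return "HTTP/2"
    # In this sketch, unset or unrecognized values fall back to HTTP/1.1.
    return "HTTP/1.1"


# Example pools; the port numbers are illustrative only.
vllm_pool = InferencePoolSpec(target_port=8000)
triton_grpc_pool = InferencePoolSpec(target_port=8001, app_protocol=H2C)

print(upstream_protocol(vllm_pool))
print(upstream_protocol(triton_grpc_pool))
```

Defaulting to HTTP/1.1 when the field is unset would keep existing pools working unchanged, which is one possible design choice rather than a settled one.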
