Description
Describe the bug
/kind bug
What steps did you take and what happened:
I followed the docs tutorial using Multipass on a Windows machine.
After deploying Onepanel with
microk8s config > kubeconfig
KUBECONFIG=./kubeconfig opctl apply
the kfserving-controller:v0.6.0 image fails to pull with a 401 Unauthorized error.
In the Onepanel UI, creating a new model server as in the linked example results in the following error:
[500] Internal error occurred: failed calling webhook "inferenceservice.kfserving-webhook-server.v1beta1.defaulter": Post "https://kfserving-webhook-server-service.kfserving-system.svc:443/mutate-serving-kubeflow-org-v1beta1-inferenceservice?timeout=30s": dial tcp 10.152.183.188:443: connect: connection refused http://serving.onepanel.pvaintern/api/namespaces/pvaonepanel/inferenceservices
I am guessing that these two errors are connected.
What did you expect to happen:
The gcr.io/kfserving/kfserving-controller:v0.6.0 image should be accessible, and a new model server should be created.
Anything else you would like to add:
Output of microk8s.kubectl get pods/kfserving-controller-manager-0 -n kfserving-system
kfserving-system kfserving-controller-manager-0 1/2 ImagePullBackOff 1 23h
kubectl describe pod:
Events:
Type Reason Age From Message
Warning Failed 26m (x267 over 23h) kubelet Failed to pull image "gcr.io/kfserving/kfserving-controller:v0.6.0": rpc error: code = Unknown desc = failed to pull and unpack image "gcr.io/kfserving/kfserving-controller:v0.6.0": failed to resolve reference "gcr.io/kfserving/kfserving-controller:v0.6.0": pulling from host gcr.io failed with status code [manifests v0.6.0]: 401 Unauthorized
Normal Pulling 21m (x269 over 23h) kubelet Pulling image "gcr.io/kfserving/kfserving-controller:v0.6.0"
Normal BackOff 55s (x6038 over 23h) kubelet Back-off pulling image "gcr.io/kfserving/kfserving-controller:v0.6.0"
Output of microk8s.kubectl logs pod/kfserving-controller-manager-0 -n kfserving-system -c manager
Error from server (BadRequest): container "manager" in pod "kfserving-controller-manager-0" is waiting to start: trying and failing to pull image
Output of microk8s.kubectl logs pod/kfserving-controller-manager-0 -n kfserving-system -c kube-rbac-proxy
I1004 07:52:03.495440 1 main.go:209] Generating self signed cert as no cert is provided
I1004 07:52:03.666661 1 main.go:242] Listening securely on 0.0.0.0:8443
Anything else you would like to add:
Manually importing the Docker image via microk8s ctr image import kfserving.kfserving-controller.tar did not solve the problem.
According to the issues below, changing the pull location from gcr.io to docker.io should help. (docker.io is where I was able to pull the image manually.)
kserve/kserve#1781
kserve/kserve#1976 (comment)
https://hub.docker.com/u/kfserving
I also tried changing line 32121 (below) in .onepanel/kubernetes.yaml from gcr.io to docker.io and applying the change with KUBECONFIG=./kubeconfig opctl apply, but the file was reset to its original state.
containers:
- args:
  - --metrics-addr=127.0.0.1:8080
  command:
  - /manager
  env:
  - name: POD_NAMESPACE
    valueFrom:
      fieldRef:
        fieldPath: metadata.namespace
  - name: SECRET_NAME
    value: kfserving-webhook-server-cert
  image: gcr.io/kfserving/kfserving-controller:v0.6.0 # line 32121
  imagePullPolicy: Always
  name: manager
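Since editing the generated file does not stick, one thing that might be worth trying is changing the image on the running resource instead. The following is only a sketch under assumptions that are not confirmed in this report: that the controller is a StatefulSet named kfserving-controller-manager (guessed from the pod name kfserving-controller-manager-0), and that the same v0.6.0 tag is available under docker.io/kfserving. The swap_registry helper is hypothetical and exists only for illustration. The script prints the command by default and runs it only when APPLY=1 is set.

```shell
#!/bin/sh
# Sketch: point the KFServing controller at docker.io instead of gcr.io.
# Assumptions (not confirmed in this report): the controller is a StatefulSet
# named kfserving-controller-manager, and the same tag exists on docker.io.

# Hypothetical helper: replace the registry host (the text before the first "/").
swap_registry() {
  printf '%s/%s\n' "$2" "${1#*/}"
}

old_image="gcr.io/kfserving/kfserving-controller:v0.6.0"
new_image="$(swap_registry "$old_image" "docker.io")"

# "manager" is the container name taken from the manifest excerpt above.
cmd="microk8s.kubectl -n kfserving-system set image statefulset/kfserving-controller-manager manager=$new_image"

# Dry run by default: print the command, and run it only when APPLY=1.
if [ "${APPLY:-0}" = "1" ]; then
  $cmd
else
  echo "$cmd"
fi
```

A change made this way may be overwritten again the next time opctl apply runs, so it would be a workaround at best and not a fix for the manifest itself.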
opctl version
CLI version: v1.0.2
Manifest version: v1.0.2
API version: v1.0.2
Web UI version: v1.0.2
opctl init command
opctl init --provider microk8s --enable-metallb --artifact-repository-provider s3
Kubernetes information
- Cloud provider: Microk8s
- Kubernetes version: 1.21
Machine information
- OS: Windows 10 Pro 19043.1889
- Browser: Firefox
Any help would be appreciated! Thanks :)