Handling ImagePullBackOff errors when installing metrics-server

Earlier versions of minikube (Kubernetes) used Heapster to monitor clusters; since version 1.8, resource monitoring has gradually moved to metrics-server. Installing metrics-server is simple, but network issues often cause ImagePullBackOff errors.

Error behavior

In minikube, a single command enables metrics-server:

minikube addons enable metrics-server

The system responds:

* The 'metrics-server' addon is enabled

Run the list command:

minikube addons list

The output also shows that metrics-server is enabled:

|-----------------------------|----------|--------------|
|         ADDON NAME          | PROFILE  |    STATUS    |
|-----------------------------|----------|--------------|
| dashboard                   | minikube | enabled ✅   |
| default-storageclass        | minikube | enabled ✅   |
| efk                         | minikube | disabled     |
| freshpod                    | minikube | disabled     |
| gvisor                      | minikube | disabled     |
| helm-tiller                 | minikube | disabled     |
| ingress                     | minikube | disabled     |
| ingress-dns                 | minikube | disabled     |
| istio                       | minikube | disabled     |
| istio-provisioner           | minikube | disabled     |
| logviewer                   | minikube | disabled     |
| metrics-server              | minikube | enabled ✅   |
| nvidia-driver-installer     | minikube | disabled     |
| nvidia-gpu-device-plugin    | minikube | disabled     |
| registry                    | minikube | disabled     |
| registry-creds              | minikube | disabled     |
| storage-provisioner         | minikube | enabled ✅   |
| storage-provisioner-gluster | minikube | disabled     |
|-----------------------------|----------|--------------|

But if we run the top command:

kubectl top pod

we get the following error message:

Error from server (ServiceUnavailable): the server is currently unable to handle the request (get nodes.metrics.k8s.io)

This shows that metrics-server is not actually running. Why?

To find the cause, list the pods in the kube-system namespace:

kubectl get pod -n kube-system

The output looks like this:

NAME                                   READY   STATUS             RESTARTS   AGE
pod/coredns-6955765f44-5wsjv           1/1     Running            0          21m
pod/coredns-6955765f44-lzkq2           1/1     Running            0          21m
pod/etcd-minikube                      1/1     Running            0          21m
pod/kube-apiserver-minikube            1/1     Running            0          21m
pod/kube-controller-manager-minikube   1/1     Running            0          21m
pod/kube-proxy-dfd7m                   1/1     Running            0          21m
pod/kube-scheduler-minikube            1/1     Running            0          21m
pod/metrics-server-6754dbc9df-lhp9p    0/1     ImagePullBackOff   0          19m
pod/storage-provisioner                1/1     Running            2          3d5h

As you can see, the pod for metrics-server did not start and is stuck in the ImagePullBackOff state.
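On a busy cluster it can help to filter that listing down to the pods that failed to pull their image. The following is a minimal sketch, not part of minikube or kubectl: `failed_pulls` is a hypothetical helper name, and it assumes the default tabular output in which STATUS is the third column.

```shell
# Print the names of pods whose STATUS column reports an image pull failure.
# Reads the tabular output of `kubectl get pod` on stdin and skips the header row.
# (failed_pulls is a hypothetical helper name used only for this example.)
failed_pulls() {
  awk 'NR > 1 && ($3 == "ImagePullBackOff" || $3 == "ErrImagePull") { print $1 }'
}

# Usage against a live cluster (requires kubectl and a running minikube):
#   kubectl get pod -n kube-system | failed_pulls
```

Reading from stdin keeps the helper usable on saved output as well as on a live cluster.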

Next, run the describe command on the pod to view its events:

kubectl describe pod metrics-server-6754dbc9df-lhp9p -n kube-system

The system displays:

  Type     Reason     Age                From               Message
  ----     ------     ----               ----               -------
  Normal   Scheduled  <unknown>          default-scheduler  Successfully assigned kube-system/metrics-server-6754dbc9df-f5pl8 to minikube
  Normal   BackOff    38s                kubelet, minikube  Back-off pulling image "k8s.gcr.io/metrics-server-amd64:v0.2.1"
  Warning  Failed     38s                kubelet, minikube  Error: ImagePullBackOff
  Normal   Pulling    24s (x2 over 54s)  kubelet, minikube  Pulling image "k8s.gcr.io/metrics-server-amd64:v0.2.1"
  Warning  Failed     9s (x2 over 38s)   kubelet, minikube  Failed to pull image "k8s.gcr.io/metrics-server-amd64:v0.2.1": rpc error: code = Unknown desc = Error response from daemon: Get https://k8s.gcr.io/v2/: net/http: request canceled while waiting for connection (Client.Timeout exceeded while awaiting headers)
  Warning  Failed     9s (x2 over 38s)   kubelet, minikube  Error: ErrImagePull

This makes the cause clear: k8s.gcr.io cannot be reached, so the image metrics-server-amd64:v0.2.1 cannot be pulled, and the required pod therefore cannot start.
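To pick out the exact image reference that failed without reading the whole event list, the "Failed to pull image" messages can be filtered. This is a rough sketch under the assumption that the event text has the form shown above; `failed_images` is a hypothetical helper name.

```shell
# Extract the image references named in 'Failed to pull image "..."' events.
# Reads `kubectl describe pod` output on stdin and prints each image once.
# (failed_images is a hypothetical helper name used only for this example.)
failed_images() {
  sed -n 's/.*Failed to pull image "\([^"]*\)".*/\1/p' | sort -u
}

# Usage against a live cluster (requires kubectl and a running minikube):
#   kubectl describe pod metrics-server-6754dbc9df-lhp9p -n kube-system | failed_images
```

The printed reference, including its tag, is the name the locally available image must carry for the kubelet to use it.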

Manually pull the image

To solve this problem, we first need to get the image. Since we can't access k8s.gcr.io, we download it from a mirror inside China. Run the following commands in order.

Open a shell inside minikube:

minikube ssh

Pull the image from Alibaba's repository:

docker pull registry.cn-hangzhou.aliyuncs.com/google_containers/metrics-server-amd64:v0.2.1

Tag the image with the name the deployment expects:

docker tag registry.cn-hangzhou.aliyuncs.com/google_containers/metrics-server-amd64:v0.2.1 k8s.gcr.io/metrics-server-amd64:v0.2.1

This leaves a correctly named metrics-server image in minikube's local image store.
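The pull-and-tag pair can be wrapped in a small function so the same steps apply to other blocked images. This is only a sketch: `mirror_ref` and `pull_via_mirror` are hypothetical names, and it assumes the Alibaba repository hosts the image under the same name and tag, which this article confirms only for metrics-server-amd64:v0.2.1.

```shell
# Map a k8s.gcr.io image reference to the Alibaba mirror path used above.
# Assumption: the mirror keeps the same image name and tag.
mirror_ref() {
  case $1 in
    k8s.gcr.io/*)
      printf '%s\n' "registry.cn-hangzhou.aliyuncs.com/google_containers/${1#k8s.gcr.io/}" ;;
    *)
      echo "not a k8s.gcr.io image: $1" >&2
      return 1 ;;
  esac
}

# Pull the image from the mirror, then tag it with its original name.
# Set DOCKER=echo to print the docker commands instead of running them.
pull_via_mirror() (
  src=$(mirror_ref "$1") || exit 1
  ${DOCKER:-docker} pull "$src" && ${DOCKER:-docker} tag "$src" "$1"
)

# Usage, inside `minikube ssh` so the image lands in minikube's Docker daemon:
#   pull_via_mirror k8s.gcr.io/metrics-server-amd64:v0.2.1
```

The functions must run against minikube's Docker daemon, not the host's; running them on the host after `eval $(minikube docker-env)` should have the same effect, though that variant is not tested here.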

Modify the deployment file for metrics-server

Even with the image available locally, you will find that minikube still tries to pull it from k8s.gcr.io, so you also need to modify the metrics-server deployment.

Run:

kubectl -n kube-system edit deployment metrics-server

The metrics-server deployment then opens in your system's default editor, for example:

# Please edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving this file will be
# reopened with the relevant failures.
#
apiVersion: apps/v1
kind: Deployment
metadata:
  annotations:
    deployment.kubernetes.io/revision: "1"
    kubectl.kubernetes.io/last-applied-configuration: |
      {"apiVersion":"apps/v1","kind":"Deployment","metadata":{"annotations":{},"labels":{"addonmanager.kubernetes.io/mode":"Reconcile","k8s-app":"metrics-server","kubernetes.io/minikube-addons":"metrics-server"},"name":"metrics-server","namespace":"kube-system"},"spec":{"selector":{"matchLabels":{"k8s-app":"metrics-server"}},"template":{"metadata":{"labels":{"k8s-app":"metrics-server"},"name":"metrics-server"},"spec":{"containers":[{"command":["/metrics-server","--source=kubernetes.summary_api:https://kubernetes.default?kubeletHttps=true\u0026kubeletPort=10250\u0026insecure=true"],"image":"k8s.gcr.io/metrics-server-amd64:v0.2.1","imagePullPolicy":"Always","name":"metrics-server"}]}}}}
  creationTimestamp: "2020-03-24T04:05:19Z"
  generation: 1
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
    k8s-app: metrics-server
    kubernetes.io/minikube-addons: metrics-server
  name: metrics-server
  namespace: kube-system
  resourceVersion: "196126"
  selfLink: /apis/apps/v1/namespaces/kube-system/deployments/metrics-server
  uid: db2ef8a7-df5a-4787-9910-08eb87b85bb6
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      k8s-app: metrics-server
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        k8s-app: metrics-server
      name: metrics-server
    spec:
      containers:
      - command:
        - /metrics-server
        - --source=kubernetes.summary_api:https://kubernetes.default?kubeletHttps=true&kubeletPort=10250&insecure=true
        image: k8s.gcr.io/metrics-server-amd64:v0.2.1
        imagePullPolicy: Always
        name: metrics-server
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30
status:
  conditions:
  - lastTransitionTime: "2020-03-24T04:05:19Z"
    lastUpdateTime: "2020-03-24T04:05:19Z"
    message: Deployment does not have minimum availability.
    reason: MinimumReplicasUnavailable
    status: "False"
    type: Available
  - lastTransitionTime: "2020-03-24T04:15:20Z"
    lastUpdateTime: "2020-03-24T04:15:20Z"
    message: ReplicaSet "metrics-server-6754dbc9df" has timed out progressing.
    reason: ProgressDeadlineExceeded
    status: "False"
    type: Progressing
  observedGeneration: 1
  replicas: 1
  unavailableReplicas: 1
  updatedReplicas: 1

On line 47 of the file, you can see that the current image pull policy is Always, which means the image is fetched from k8s.gcr.io whether or not a copy already exists locally. Change the policy to IfNotPresent so that the local image takes precedence.

After you save the file, the system updates the pod automatically; there is no need to re-enable the addon.
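The same change can be made without opening an editor by sending a JSON Patch with `kubectl patch`. The sketch below assumes metrics-server is the first (index 0) container in the pod template, as in the manifest shown above; `set_pull_policy` is a hypothetical helper name.

```shell
# Replace imagePullPolicy on the first container of the metrics-server deployment.
# Set KUBECTL=echo to print the kubectl arguments instead of contacting a cluster.
# (set_pull_policy is a hypothetical helper name used only for this example.)
set_pull_policy() {
  ${KUBECTL:-kubectl} -n kube-system patch deployment metrics-server --type=json \
    -p "[{\"op\":\"replace\",\"path\":\"/spec/template/spec/containers/0/imagePullPolicy\",\"value\":\"$1\"}]"
}

# Usage (requires a running cluster):
#   set_pull_policy IfNotPresent
```

As with the interactive edit, changing the pod template triggers a new rollout, so the pod is recreated with the new policy.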

After completing the steps above, you can run the top command:

kubectl top pod

In my environment, it displays:

NAME                           CPU(cores)   MEMORY(bytes)
redis-ha-1584698965-server-0   2m           4Mi

Of course, metrics-server is not enabled just to run this simple top command. Its real purpose is to expose more monitoring information through the Metrics API and to make it possible to configure automatic scaling of applications based on conditions.
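As a rough illustration of those two uses, the sketch below queries the Metrics API directly and creates a CPU-based autoscaler. The function names are hypothetical, the thresholds (50% CPU, 1 to 5 replicas) are arbitrary example values, and it assumes metrics-server serves the metrics.k8s.io/v1beta1 API group in your cluster.

```shell
# Set KUBECTL=echo to print the kubectl arguments instead of contacting a cluster.

# Fetch raw node metrics (JSON) straight from the Metrics API.
# (node_metrics is a hypothetical helper name.)
node_metrics() {
  ${KUBECTL:-kubectl} get --raw /apis/metrics.k8s.io/v1beta1/nodes
}

# Create a HorizontalPodAutoscaler for the deployment named in $1.
# The 50% CPU target and the 1..5 replica range are example values only.
# (autoscale_cpu is a hypothetical helper name.)
autoscale_cpu() {
  ${KUBECTL:-kubectl} autoscale deployment "$1" --cpu-percent=50 --min=1 --max=5
}

# Usage (requires a running cluster; my-app is a placeholder deployment name):
#   node_metrics
#   autoscale_cpu my-app
```

A CPU-percentage target is computed relative to the CPU requests of the pods, so the target deployment needs CPU requests set for the autoscaler to act on it.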