Overview
The metrics server collects resource metrics from your Kubernetes cluster. It’s also used by the Horizontal Pod Autoscaler (HPA) to scale pods based on those metrics.
Installation
For my clusters, it’s a pretty simple configuration. I retrieve the components.yaml file from the metrics-server GitHub site (see References below), compare it with the previous version if there is one, pull the images, tag them, and push them to the local repository, then update components.yaml to point to the local repository. When that’s done, simply apply it to the cluster.
kubectl apply -f components.yaml
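The mirroring step before the apply can be sketched roughly as follows. This is only an illustration under assumptions: registry.example.local:5000 is a hypothetical local registry, the v0.0.0 style tags are placeholders, docker is assumed as the image tool (use whatever you normally use), and the helper names are made up for this example.

```shell
# Sketch only: helper names and the registry used below are hypothetical.

# Print every image referenced in a manifest (lines like "image: host/path:tag").
list_images() {
  awk '$1 == "image:" { print $2 }' "$1"
}

# Swap the registry host (everything before the first "/") for the local one.
local_image_name() {
  printf '%s/%s\n' "$1" "${2#*/}"
}

# Typical use (commented out so nothing is pulled or pushed by accident):
# LOCAL=registry.example.local:5000
# for img in $(list_images components.yaml); do
#   new=$(local_image_name "$LOCAL" "$img")
#   docker pull "$img" && docker tag "$img" "$new" && docker push "$new"
#   sed -i "s|image: $img|image: $new|" components.yaml
# done
```

Running a diff of components.yaml before and after the loop is a quick way to confirm that only the image lines changed.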
Issue
I found that if I add serverTLSBootstrap: true to the KubeletConfiguration block when initializing the cluster, it’ll be added to the appropriate config.yaml files and the kubelet-config configmap in the cluster, which avoids the issue described below. I will leave the manual fix here as a reminder in case it pops up.
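For the init-time approach, a minimal sketch of adding the serverTLSBootstrap: true setting to a kubeadm config file might look like this. The function name and the kubeadm-config.yaml file name are hypothetical, and the sketch assumes the file does not already contain a KubeletConfiguration document (if it does, add the one line to that block instead of appending a second one).

```shell
# Sketch only: append a KubeletConfiguration document that enables
# serverTLSBootstrap to a kubeadm config file (file name is up to you).
append_kubelet_tls_bootstrap() {
  cat >> "$1" <<'EOF'
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
serverTLSBootstrap: true
EOF
}

# Typical use (commented out; kubeadm-config.yaml is a hypothetical name):
# append_kubelet_tls_bootstrap kubeadm-config.yaml
# kubeadm init --config kubeadm-config.yaml
```

The kubelet serving CSRs still have to be approved after the nodes come up; the setting only makes the kubelets request them.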
There is one issue that has to be addressed. See the References section for a link. Basically one of the metrics-server flags (--kubelet-preferred-address-types) indicates a preference for using node IP addresses before external IPs or hostnames when connecting to the kubelets. Since IP addresses weren’t part of the cluster build, metrics-server won’t become ready, generating tons of certificate errors. Of course you can move Hostname to the front of the list, but then you’re adding a DNS lookup to the process. You can also add the --kubelet-insecure-tls flag to skip certificate verification, which of course isn’t secure.
kube-system metrics-server-5597479f8d-fn8xm 0/1 Running 0 13h
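To see which kubelet-related flags your copy of the manifest actually sets, a small helper along these lines can be used. The function name is made up for this example, and the flag values in your components.yaml may differ from the ones mentioned in the comments.

```shell
# Sketch only: print the --kubelet-* arguments found in a metrics-server manifest,
# with the YAML list marker and indentation stripped.
kubelet_flags() {
  grep -e '--kubelet-' "$1" | sed 's/^[[:space:]]*-[[:space:]]\{1,\}//'
}

# Typical use:
# kubelet_flags components.yaml
#
# The two workarounds mentioned above would look like this in the args list
# (neither is the fix used in this article):
#   --kubelet-preferred-address-types=Hostname,InternalIP,ExternalIP
#   --kubelet-insecure-tls
```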
What to do?
First you’ll need to edit the kubelet-config configmap and add serverTLSBootstrap: true right after the kind: KubeletConfiguration line and save it.
$ kubectl edit configmap kubelet-config -n kube-system
configmap/kubelet-config edited
Next you’ll have to edit the /var/lib/kubelet/config.yaml file on every control plane node and worker node, add the same line in the same place, and restart kubelet.
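If there are a lot of nodes, the per-node edit can be scripted. The following is a minimal sketch, not a tested rollout procedure: the function name is hypothetical, it assumes the file contains a line starting with kind: KubeletConfiguration, and it assumes kubelet runs under systemd.

```shell
# Sketch only: add "serverTLSBootstrap: true" right after the
# "kind: KubeletConfiguration" line. If any serverTLSBootstrap line is
# already present the file is left untouched, so re-running is safe.
add_server_tls_bootstrap() {
  cfg="$1"
  if grep -q '^serverTLSBootstrap:' "$cfg"; then
    return 0
  fi
  tmp_cfg=$(mktemp) || return 1
  awk '{ print } /^kind: KubeletConfiguration/ { print "serverTLSBootstrap: true" }' \
    "$cfg" > "$tmp_cfg" && cat "$tmp_cfg" > "$cfg"   # cat keeps owner and mode of the original
  rc=$?
  rm -f "$tmp_cfg"
  return $rc
}

# Typical use on each node (commented out; run as root):
# add_server_tls_bootstrap /var/lib/kubelet/config.yaml
# systemctl restart kubelet
```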
Finally, certificate signing requests (CSRs) will be created for each node. You’ll need to approve each one.
$ kubectl get csr
NAME AGE SIGNERNAME REQUESTOR REQUESTEDDURATION CONDITION
csr-4kr8m 20s kubernetes.io/kubelet-serving system:node:bldr0cuomkube3.dev.internal.pri <none> Pending
csr-fqpvs 28s kubernetes.io/kubelet-serving system:node:bldr0cuomknode3.dev.internal.pri <none> Pending
csr-m526d 27s kubernetes.io/kubelet-serving system:node:bldr0cuomkube2.dev.internal.pri <none> Pending
csr-nc6t7 27s kubernetes.io/kubelet-serving system:node:bldr0cuomkube1.dev.internal.pri <none> Pending
csr-wxhfd 28s kubernetes.io/kubelet-serving system:node:bldr0cuomknode1.dev.internal.pri <none> Pending
csr-z42x4 28s kubernetes.io/kubelet-serving system:node:bldr0cuomknode2.dev.internal.pri <none> Pending
$ kubectl certificate approve csr-4kr8m
certificatesigningrequest.certificates.k8s.io/csr-4kr8m approved
During this process, if you’re monitoring the pods, you’ll see the metrics-server become ready. That happens once you’ve approved the CSR for the node where the metrics-server is running. Make sure you approve the CSRs for all of the nodes.
kube-system metrics-server-5597479f8d-fn8xm 1/1 Running 0 13h
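Approving the requests one at a time gets tedious with more than a few nodes. A rough sketch of picking out just the pending kubelet-serving requests from the kubectl get csr output is shown below. The function name is made up for this example, and it assumes the column layout shown above (name first, signer third, condition last).

```shell
# Sketch only: read "kubectl get csr" output on stdin and print the names of
# requests that use the kubelet-serving signer and are still Pending.
pending_serving_csrs() {
  awk '$3 == "kubernetes.io/kubelet-serving" && $NF == "Pending" { print $1 }'
}

# Typical use (commented out so nothing is approved by accident):
# kubectl get csr --no-headers | pending_serving_csrs | while read -r name; do
#   kubectl certificate approve "$name"
# done
```

Only approve in bulk like this when you recognize every requesting node in the list; otherwise check each request first.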
Issue
I’m still working through this, but whether I start the metrics-server before or after Calico is started, the metrics-server pod has to be deleted (and recreated) before it actually returns metrics. I will try a few more installations to see if I can identify exactly when the metrics-server should be started.
References
- https://github.com/kubernetes-sigs/metrics-server
- https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-certs/#kubelet-serving-certs
- https://github.com/kubernetes-sigs/metrics-server/issues/196: This helped me resolve the issue, mainly by pointing to the actual docs, but there is good troubleshooting info here as well.