One problem I found with my OKD4 cluster was that the HAProxy statistics page showed 5 of my 7 worker nodes as red, meaning down. After some searching, I found that HAProxy determines this through the ingress router pods. A closer look at the cluster showed that only two ingress router pods were running. You would think this would be a daemonset, so that an ingress router is available on every worker. Two may seem sufficient, but consider my setup: three physical hosts run my OKD4 cluster. If the 7 workers are spread across the three hosts and both router pods land on workers on the same host, then when that host fails, any application that uses the ingress router has to wait until OKD4 notices the pods are gone and spins up two replacements.
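To see how the router pods are spread across the workers, something like the following sketch can help. It assumes the router pods live in the openshift-ingress namespace (as in the pod listing in this post) and counts every pod in that namespace per node; the function name is just one I made up for illustration.

```shell
# Count pods in the openshift-ingress namespace per node.
# Assumption: only the router pods run in this namespace.
router_pods_per_node() {
  oc get pods -n openshift-ingress \
    -o custom-columns=NODE:.spec.nodeName --no-headers |
    awk '{ count[$1]++ } END { for (node in count) print node, count[node] }' |
    sort
}
```

Calling `router_pods_per_node` prints one line per node with the number of router pods on it. Workers that do not appear in the output have no router pod.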
At first I figured it was the deployment that needed to be updated. However, updating the deployment replicas from 2 to 7 failed: the number of replicas reverted to 2.
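If you want to watch the revert yourself, a small helper like this reads the replica count back from the deployment. It is only a sketch: the deployment name router-default is inferred from the pod names in this post, and the function name is hypothetical.

```shell
# Read back the replica count currently set on the router deployment.
# Assumption: the deployment is named "router-default", as the pod names suggest.
router_deployment_replicas() {
  oc get deployment/router-default -n openshift-ingress \
    -o jsonpath='{.spec.replicas}'
}

# Example use (not run here): scale by hand, wait a little, then read back.
#   oc scale deployment/router-default -n openshift-ingress --replicas=7
#   sleep 30
#   router_deployment_replicas
```

On my cluster the value went back to 2. How long that takes may vary.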
After some hunting, I found the solution. You have to patch the IngressController resource that the ingress operator manages, not the deployment.
oc patch -n openshift-ingress-operator ingresscontroller/default --patch '{"spec":{"replicas": 7}}' --type=merge
And success. Now there are 7 ingress pods running on my cluster.
openshift-ingress router-default-6b8b455c59-56gk5 1/1 Running 0 16d
openshift-ingress router-default-6b8b455c59-6z678 1/1 Running 0 16d
openshift-ingress router-default-6b8b455c59-dhrgx 1/1 Running 0 16d
openshift-ingress router-default-6b8b455c59-kgs5n 1/1 Running 0 16d
openshift-ingress router-default-6b8b455c59-ngvdx 1/1 Running 2 16d
openshift-ingress router-default-6b8b455c59-t8zmd 1/1 Running 0 16d
openshift-ingress router-default-6b8b455c59-wbh2z 1/1 Running 0 16d
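Rather than counting pods by eye, the check can be scripted. This is a minimal sketch that compares the replica count requested on the IngressController with the number of pods in the Running phase. The function names are made up, and it assumes that only router pods run in the openshift-ingress namespace.

```shell
# Desired replica count from the IngressController patched above.
desired_router_replicas() {
  oc get ingresscontroller/default -n openshift-ingress-operator \
    -o jsonpath='{.spec.replicas}'
}

# Number of pods in the openshift-ingress namespace in the Running phase.
running_router_pods() {
  oc get pods -n openshift-ingress \
    --field-selector=status.phase=Running -o name | wc -l | tr -d ' '
}

# Succeeds (exit status 0) only when both numbers agree.
routers_match_desired() {
  [ "$(desired_router_replicas)" = "$(running_router_pods)" ]
}
```

For example, `routers_match_desired && echo "all routers up"` prints the message only once every requested router pod is running.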
References
- https://access.redhat.com/solutions/5393521 – You need a Red Hat account to access this page.
- https://docs.openshift.com/container-platform/4.9/networking/ingress-operator.html#nw-ingress-controller-configuration_configuring-ingress – OpenShift documentation