ArgoCD CLI Commands

Overview

This article lists argocd CLI commands that were used to review and manage the ArgoCD installation. Finding useful commands isn’t always easy, so this article collects the commands I used most when getting things set up.

Help

Once you log in, simply typing argocd will give you a pretty complete list of commands and flags you can use. Generally I’m checking my project and applications but other commands are available.

Available Commands:
  account     Manage account settings
  admin       Contains a set of commands useful for Argo CD administrators and requires direct Kubernetes access
  app         Manage applications
  appset      Manage ApplicationSets
  cert        Manage repository certificates and SSH known hosts entries
  cluster     Manage cluster credentials
  completion  output shell completion code for the specified shell (bash or zsh)
  context     Switch between contexts
  gpg         Manage GPG keys used for signature verification
  help        Help about any command
  login       Log in to Argo CD
  logout      Log out from Argo CD
  proj        Manage projects
  relogin     Refresh an expired authenticate token
  repo        Manage repository connection parameters
  repocreds   Manage repository connection parameters
  version     Print version information

Logging In

As noted in a prior article, there were a few issues with logging in to the cluster, and you have to log in in order to use the CLI tool. Mainly because I didn’t set up certificates, I needed to use the --insecure flag to log in.

$ argocd login argocd.dev.internal.pri --insecure
WARN[0000] Failed to invoke grpc call. Use flag --grpc-web in grpc calls. To avoid this warning message, use flag --grpc-web.
Username: admin
Password:
'admin:login' logged in successfully
Context 'argocd.dev.internal.pri' updated

And I’m in. From here I can manage my ArgoCD projects. Since I use GitOps to manage my Kubernetes clusters, the commands I use here are generally for getting information rather than setting up the project. See my github repo for those configurations.

Project Information

First off, you need to know what projects are installed. There is always a default project; the two projects I have installed are the blue and green llamas projects.

First, the available proj subcommands.

Available Commands:
  add-destination          Add project destination
  add-orphaned-ignore      Add a resource to orphaned ignore list
  add-signature-key        Add GnuPG signature key to project
  add-source               Add project source repository
  allow-cluster-resource   Adds a cluster-scoped API resource to the allow list and removes it from deny list
  allow-namespace-resource Removes a namespaced API resource from the deny list or add a namespaced API resource to the allow list
  create                   Create a project
  delete                   Delete project
  deny-cluster-resource    Removes a cluster-scoped API resource from the allow list and adds it to deny list
  deny-namespace-resource  Adds a namespaced API resource to the deny list or removes a namespaced API resource from the allow list
  edit                     Edit project
  get                      Get project details
  list                     List projects
  remove-destination       Remove project destination
  remove-orphaned-ignore   Remove a resource from orphaned ignore list
  remove-signature-key     Remove GnuPG signature key from project
  remove-source            Remove project source repository
  role                     Manage a project's roles
  set                      Set project parameters
  windows                  Manage a project's sync windows

Then you can use the list command to view the projects.

$ argocd proj list
NAME          DESCRIPTION                                 DESTINATIONS    SOURCES                                                    CLUSTER-RESOURCE-WHITELIST  NAMESPACE-RESOURCE-BLACKLIST  SIGNATURE-KEYS  ORPHANED-RESOURCES
default                                                   *,*             *                                                          */*                         <none>                        <none>          disabled
llamas-blue   Project to install the llamas band website  4 destinations  git@lnmt1cuomgitlab.internal.pri/external-unix/gitops.git  */*                         <none>                        <none>          disabled
llamas-green  Project to install the llamas band website  4 destinations  git@lnmt1cuomgitlab.internal.pri/external-unix/gitops.git  */*                         <none>                        <none>          disabled

If I wanted to check out the details of a project, I’d run the get command:

$ argocd proj get llamas-blue
Name:                        llamas-blue
Description:                 Project to install the llamas band website
Destinations:                https://kubernetes.default.svc,llamas-blue
                             https://cabo0cuomvip1.qa.internal.pri:6443,llamas-blue
                             https://tato0cuomvip1.stage.internal.pri:6443,llamas-blue
                             https://lnmt1cuomvip1.internal.pri:6443,llamas-blue
Repositories:                git@lnmt1cuomgitlab.internal.pri/external-unix/gitops.git
Scoped Repositories:         <none>
Allowed Cluster Resources:   */*
Scoped Clusters:             <none>
Denied Namespaced Resources: <none>
Signature keys:              <none>
Orphaned Resources:          disabled

Comparing the two, the only real difference is that the get command lists the remote K8S clusters instead of just indicating there are 4 destinations.

Application Information

The main thing I’m checking is the application status. Let’s see the options first.

Available Commands:
  actions         Manage Resource actions
  create          Create an application
  delete          Delete an application
  delete-resource Delete resource in an application
  diff            Perform a diff against the target and live state.
  edit            Edit application
  get             Get application details
  history         Show application deployment history
  list            List applications
  logs            Get logs of application pods
  manifests       Print manifests of an application
  patch           Patch application
  patch-resource  Patch resource in an application
  resources       List resource of application
  rollback        Rollback application to a previous deployed version by History ID, omitted will Rollback to the previous version
  set             Set application parameters
  sync            Sync an application to its target state
  terminate-op    Terminate running operation of an application
  unset           Unset application parameters
  wait            Wait for an application to reach a synced and healthy state

For that I’d run the following list command:

$ argocd app list
NAME                       CLUSTER                                        NAMESPACE     PROJECT       STATUS  HEALTH   SYNCPOLICY  CONDITIONS  REPO                                                       PATH                 TARGET
argocd/llamas-blue-dev     https://kubernetes.default.svc                 llamas-blue   llamas-blue   Synced  Healthy  Auto        <none>      git@lnmt1cuomgitlab.internal.pri/external-unix/gitops.git  dev/llamas-blue/     dev
argocd/llamas-blue-prod    https://lnmt1cuomvip1.internal.pri:6443        llamas-blue   llamas-blue   Synced  Healthy  Auto        <none>      git@lnmt1cuomgitlab.internal.pri/external-unix/gitops.git  prod/llamas-blue/    main
argocd/llamas-blue-qa      https://cabo0cuomvip1.qa.internal.pri:6443     llamas-blue   llamas-blue   Synced  Healthy  Auto        <none>      git@lnmt1cuomgitlab.internal.pri/external-unix/gitops.git  qa/llamas-blue/      main
argocd/llamas-blue-stage   https://tato0cuomvip1.stage.internal.pri:6443  llamas-blue   llamas-blue   Synced  Healthy  Auto        <none>      git@lnmt1cuomgitlab.internal.pri/external-unix/gitops.git  stage/llamas-blue/   main
argocd/llamas-green-dev    https://kubernetes.default.svc                 llamas-green  llamas-green  Synced  Healthy  Auto        <none>      git@lnmt1cuomgitlab.internal.pri/external-unix/gitops.git  dev/llamas-green/    dev
argocd/llamas-green-prod   https://lnmt1cuomvip1.internal.pri:6443        llamas-green  llamas-green  Synced  Healthy  Auto        <none>      git@lnmt1cuomgitlab.internal.pri/external-unix/gitops.git  prod/llamas-green/   main
argocd/llamas-green-qa     https://cabo0cuomvip1.qa.internal.pri:6443     llamas-green  llamas-green  Synced  Healthy  Auto        <none>      git@lnmt1cuomgitlab.internal.pri/external-unix/gitops.git  qa/llamas-green/     main
argocd/llamas-green-stage  https://tato0cuomvip1.stage.internal.pri:6443  llamas-green  llamas-green  Synced  Healthy  Auto        <none>      git@lnmt1cuomgitlab.internal.pri/external-unix/gitops.git  stage/llamas-green/  main

There is a lot of information here, in part because my ArgoCD instance is connected to and manages applications on four Kubernetes clusters, so you’ll see the blue and green applications four times each.

Getting details though provides a ton of information.

$ argocd app get argocd/llamas-blue-dev
Name:               argocd/llamas-blue-dev
Project:            llamas-blue
Server:             https://kubernetes.default.svc
Namespace:          llamas-blue
URL:                https://argocd.dev.internal.pri/applications/llamas-blue-dev
Repo:               git@lnmt1cuomgitlab.internal.pri/external-unix/gitops.git
Target:             dev
Path:               dev/llamas-blue/
SyncWindow:         Sync Allowed
Sync Policy:        Automated
Sync Status:        Synced to dev (1ef2090)
Health Status:      Healthy

GROUP                      KIND                     NAMESPACE    NAME                        STATUS   HEALTH   HOOK  MESSAGE
                           ResourceQuota            llamas-blue  llamas-rq                   Synced                  resourcequota/llamas-rq unchanged
                           LimitRange               llamas-blue  llamas-lr                   Synced                  limitrange/llamas-lr unchanged
                           ServiceAccount           llamas-blue  cschelin-admin              Synced                  serviceaccount/cschelin-admin unchanged
                           ServiceAccount           llamas-blue  cschelin                    Synced                  serviceaccount/cschelin unchanged
rbac.authorization.k8s.io  ClusterRoleBinding       llamas-blue  cschelin-view-llamas-blue   Running  Synced         clusterrolebinding.rbac.authorization.k8s.io/cschelin-view-llamas-blue reconciled. clusterrolebinding.rbac.authorization.k8s.io/cschelin-view-llamas-blue unchanged
rbac.authorization.k8s.io  ClusterRoleBinding       llamas-blue  cschelin-admin-llamas-blue  Running  Synced         clusterrolebinding.rbac.authorization.k8s.io/cschelin-admin-llamas-blue reconciled. clusterrolebinding.rbac.authorization.k8s.io/cschelin-admin-llamas-blue unchanged
                           Service                  llamas-blue  llamas                      Synced   Healthy        service/llamas unchanged
autoscaling                HorizontalPodAutoscaler  llamas-blue  llamas                      Synced   Healthy        horizontalpodautoscaler.autoscaling/llamas unchanged
networking.k8s.io          Ingress                  llamas-blue  llamas                      Synced   Healthy        ingress.networking.k8s.io/llamas configured
argoproj.io                Rollout                  llamas-blue  llamas                      Synced   Healthy        rollout.argoproj.io/llamas unchanged
rbac.authorization.k8s.io  ClusterRoleBinding                    cschelin-admin-llamas-blue  Synced
rbac.authorization.k8s.io  ClusterRoleBinding                    cschelin-view-llamas-blue   Synced



Continuous Delivery With ArgoCD

Overview

This article provides instructions for installing and configuring ArgoCD in Kubernetes.

Installation

The main motivation here is that OpenShift uses ArgoCD, so we should be familiar with how ArgoCD works.

Images

Installation-wise, it’s pretty easy, but there are a couple of changes you’ll need to make. First, review the install.yaml file to see which images will be loaded. Bring them in to the local repository following those instructions, then update the install.yaml file to point to the local repository.

Next, make sure the imagePullPolicy is set to Always for security reasons; this is one of the reasons we host the images locally, so we’re not constantly pulling from the internet.

Private Repository

In order to access our private gitlab server and private projects, we’ll want to create an SSH public/private key pair. Simply press Enter at the passphrase prompt for a passwordless key. Note you’ll want to save the keypair somewhere safe in case you need to use it again. For ArgoCD, you’ll be creating repository entries for each project.

ssh-keygen -t rsa

Next, in gitlab, access Settings and SSH Keys and add your new public key. I called mine ArgoCD so I knew which one to manage.

You’ll need to add an entry in ArgoCD under Settings, Repository Certificates and Known Hosts. Since I have several repos on my gitlab server, I simply logged into my bldr0cuomgit1 server, copied the single line for the gitlab server from the known_hosts file, then clicked the Add SSH Known Hosts button and added it to the list. If you don’t do this, you’ll get a known_hosts error when ArgoCD tries to connect to the repo. You can click the Skip server verification box when creating a connection to bypass this, however it’s not secure.

Next, in ArgoCD under the Settings, Repositories section, you’ll create a connection to the repository for the project. For my llamas installation, I entered the following information:

Name: GitOps Repo
Project: gitops
URL: git@lnmt1cuomgitlab.internal.pri:external-unix/gitops.git
SSH private key data: [ssh private key]

Click Connect and you should get a 'Successful' response for the repo.
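
If you prefer the CLI to the UI, the same connection can be created with argocd repo add. This is a sketch rather than my exact command, and the private key path is an assumption; check argocd repo add --help for the full set of flags.

argocd repo add git@lnmt1cuomgitlab.internal.pri:external-unix/gitops.git \
  --ssh-private-key-path ~/.ssh/argocd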

TLS Update

Per the troubleshooting section below, update the argocd-cmd-params-cm ConfigMap to add server.insecure: "true" under its data section. This ensures ArgoCD works with the haproxy-ingress controller.

Installation

Once done, create the argocd namespace file, argocd.yaml then apply it.

apiVersion: v1
kind: Namespace
metadata:
  name: argocd

kubectl apply -f argocd.yaml

Now that the namespace is created, create the argocd installation by applying the install.yaml file.

kubectl create -f install.yaml

It’ll take a few minutes for everything to start but once up, it’s all available.

In order to access the User Interface, you’ll need to create an argocd.dev.internal.pri alias to the HAProxy Load Balancer. In addition, you’ll need to apply the ingress.yaml file so you can access the UI.

kubectl apply -f ingress.yaml
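
My actual ingress.yaml lives in the GitOps repo, but a minimal sketch looks something like the following, assuming the stock argocd-server service name and the haproxy ingress class used later in this series.

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: argocd
  namespace: argocd
  annotations:
    kubernetes.io/ingress.class: haproxy
spec:
  rules:
  - host: argocd.dev.internal.pri
    http:
      paths:
      - backend:
          service:
            name: argocd-server
            port:
              number: 80
        path: /
        pathType: Prefix
  tls:
  - hosts:
    - argocd.dev.internal.pri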

Command Line Interface

Make sure you pull the argocd binary, which gives you CLI access to the argocd server.
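
If you don’t already have it, here’s a sketch of pulling the Linux binary from the project’s GitHub releases; adjust the version and architecture for your environment.

curl -sSL -o argocd https://github.com/argoproj/argo-cd/releases/latest/download/argocd-linux-amd64
chmod +x argocd
sudo mv argocd /usr/local/bin/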

Troubleshooting

After getting the haproxy-ingress controller installed and running, adding an ingress route to ArgoCD was failing. It applied successfully, however I was getting the following error from the argocd.dev.internal.pri website I’d configured:

The page isn’t redirecting properly

A quick search found the TLS Issue mentioned in the bug report (see References) which sent me over to the Multiple Ingress Objects page. At the end of the linked block of information was this paragraph:

The API server should then be run with TLS disabled. Edit the argocd-server deployment to add the --insecure flag to the argocd-server command, or simply set server.insecure: "true" in the argocd-cmd-params-cm ConfigMap

And it referred me to the ConfigMap page and I made the following update on the fly (we’ll need to fix it in the GitOps repo though).

kubectl edit configmap argocd-cmd-params-cm -n argocd

Which brought up a very minimal configmap.

apiVersion: v1
kind: ConfigMap
metadata:
  labels:
    app.kubernetes.io/name: argocd-cmd-params-cm
    app.kubernetes.io/part-of: argocd
  name: argocd-cmd-params-cm
  namespace: argocd

I made the following change and restarted the argocd-server, and now I have access to both the UI and the argocd CLI. Make sure true is in quotes though or you’ll get an error.

apiVersion: v1
data:
  server.insecure: "true"
kind: ConfigMap
metadata:
  labels:
    app.kubernetes.io/name: argocd-cmd-params-cm
    app.kubernetes.io/part-of: argocd
  name: argocd-cmd-params-cm
  namespace: argocd
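
After saving the change, I restarted the server. A rollout restart works (deleting the argocd-server pod does too):

kubectl -n argocd rollout restart deployment argocd-server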

External Clusters

I want to be able to use ArgoCD on the main cluster to push updates to remote clusters, basically managing the llamas band website from one location. To do this, I need to connect the clusters together: log in to the main ArgoCD cluster with the command line tool, argocd, make sure the working area has access to all the clusters in the .kube/config file, and finally use the argocd CLI to connect to the clusters.

The main thing I find with many articles is that they assume knowledge. While I’ve provided links to where I found information, here I provide extra detail that may have been left out of the linked articles.

Login to ArgoCD

Logging into the main dev argocd environment is pretty easy in general. I had a few problems, but eventually, with help, I got logged in. The main thing was figuring out which flags were needed and understanding what I was actually trying to connect to.

First off, I had to realize that I should be logging into the argocd ingress URL, in my case argocd.dev.internal.pri. I still had a few issues and ultimately hit the following error:

$ argocd login argocd.dev.internal.pri --skip-test-tls --grpc-web
Username: admin
Password:
FATA[0003] rpc error: code = Unknown desc = Post "https://argocd.dev.internal.pri:443/session.SessionService/Create": x509: certificate is valid for ingress.local, not argocd.dev.internal.pri

I posted a call for help as I was having trouble locating a solution, and eventually someone took pity and provided the answer: the --insecure flag. Since I was already using --skip-test-tls, I didn’t even think to check whether there was such a flag. And it worked.

$ argocd login argocd.dev.internal.pri --skip-test-tls --grpc-web --insecure
Username: admin
Password:
'admin:login' logged in successfully
Context 'argocd.dev.internal.pri' updated

Merge Kubeconfig

Next, in order for argocd to have sufficient access to the other clusters, you need to merge the cluster configuration files into a single config. You might want to create a service account with admin privileges to keep this separate from the kubernetes-admin account; since this is my homelab, for now I’m simply using the kubernetes-admin account.

One problem though: in the .kube/config file, the authinfo name is the same for each cluster, kubernetes-admin. Since it’s just a label referenced by each context (the context itself is named kubernetes-admin@bldr, for example), you can change each label to get a unique authinfo entry. Back up all the files before working on them, of course.

$ kubectl config get-contexts
CURRENT   NAME                    CLUSTER   AUTHINFO           NAMESPACE
          kubernetes-admin@bldr   bldr      kubernetes-admin
          kubernetes-admin@cabo   cabo      kubernetes-admin
          kubernetes-admin@lnmt   lnmt      kubernetes-admin
          kubernetes-admin@tato   tato      kubernetes-admin

If you do the merge as shown below without this change, there’ll be just one set of credentials for kubernetes-admin and you won’t be able to access the other clusters. What I did was change the label in each cluster’s config file, then merge them together. Under contexts, change the user to kubernetes-bldr.

contexts:
- context:
    cluster: bldr
    user: kubernetes-bldr
  name: kubernetes-admin@bldr

And in the users section, also change the name to match.

users:
- name: kubernetes-bldr

With the names changed, you can now merge the files together. I’ve named mine after each of the clusters so I have bldr, cabo, tato, and lnmt. If you have files in a different location, add the path to the files.

export KUBECONFIG=bldr:cabo:tato:lnmt

And then merge them into a single file.

kubectl config view --flatten > all-in-one.yaml

Check the file to make sure it at least looks correct, copy it to .kube/config, and then check the contexts.

$ kubectl config get-contexts
CURRENT   NAME                    CLUSTER   AUTHINFO          NAMESPACE
          kubernetes-admin@bldr   bldr      kubernetes-bldr
          kubernetes-admin@cabo   cabo      kubernetes-cabo
          kubernetes-admin@lnmt   lnmt      kubernetes-lnmt
          kubernetes-admin@tato   tato      kubernetes-tato

The AUTHINFO entries are all unique now. Change contexts to one of the other clusters and check access; once it’s all working, you should be able to add them to ArgoCD.
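
For example, a quick sanity check using the context names from above:

kubectl config use-context kubernetes-admin@cabo
kubectl get nodes
kubectl config use-context kubernetes-admin@bldr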

Cluster Add

Now to the heart of the task: adding the remote clusters to ArgoCD. Since we’re logged in and have access to all clusters from a single .kube/config file, we can add them to ArgoCD.

$ argocd cluster add kubernetes-admin@cabo --name cabo0cuomvip1.qa.internal.pri
WARNING: This will create a service account argocd-manager on the cluster referenced by context kubernetes-admin@cabo with full cluster level privileges. Do you want to continue [y/N]? y
INFO[0005] ServiceAccount "argocd-manager" created in namespace "kube-system"
INFO[0005] ClusterRole "argocd-manager-role" created
INFO[0005] ClusterRoleBinding "argocd-manager-role-binding" created
INFO[0010] Created bearer token secret for ServiceAccount "argocd-manager"
Cluster 'https://cabo0cuomvip1.qa.internal.pri:6443' added

And it’s added. Check the GUI under Settings, Clusters and you should see it there.
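
You can also verify from the command line:

argocd cluster list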

References

Ingress Controller

Overview

There are multiple IP assignments used in Kubernetes. In addition to the internal networking (by Calico in this case), you can install an Ingress Controller to manage access to your applications. This article provides some basic Service information as I explore the networking and work towards exposing my application(s) externally using an Ingress Router.

Networking

You can manage traffic for your network either with a Layer 2 or Layer 3 device or by using an Overlay network. Because maintaining pod networks in a switch is a burden, the easiest method is an Overlay network, which encapsulates traffic using VXLAN (Virtual Extensible LAN) and tunnels it to the other worker nodes in the cluster.

Services

I’ll cover the three ways to provide access to applications using a Service, along with the pluses and minuses of each.

ClusterIP

When you create a service, the default configuration assigns a ClusterIP from a pool of IPs defined as the Service Network when you created the Kubernetes cluster. The Service Network is how pods communicate with each other in the cluster. In my configuration, 10.69.0.0/16 is the network I assigned to the Service Network. When I look at a set of services, every one will have a 10.69.0.0/16 IP address.

$ kubectl get svc
NAMESPACE            NAME                                      TYPE           CLUSTER-IP      EXTERNAL-IP   PORT(S)                        AGE
default              kubernetes                                ClusterIP      10.69.0.1       <none>        443/TCP                        8d
default              my-nginx                                  NodePort       10.69.210.167   <none>        8080:31201/TCP,443:31713/TCP   11m

NodePort

Configuring a service with the type of NodePort is probably the easiest. You’re defining an externally accessible port in the range of 30,000-32,767 that is associated with your application’s port.

apiVersion: v1
kind: Service
metadata:
  name: my-nginx
  namespace: default
  labels:
    run: my-nginx
spec:
  type: NodePort
  ports:
  - name: http
    nodePort: 30100
    port: 8080
    targetPort: 80
    protocol: TCP
  - name: https
    nodePort: 30110
    port: 443
    protocol: TCP
  selector:
    run: my-nginx

When you check the services, you’ll see the node ports; if you hadn’t defined them, they’d be randomly assigned.

$ kubectl get svc
NAME               TYPE        CLUSTER-IP      EXTERNAL-IP      PORT(S)                        AGE
my-nginx           NodePort    10.69.91.108    <none>           8080:30100/TCP,443:30110/TCP   11h

Anyway, when using NodePort, you simply access the API Server IP Address and tack on the port. With that you have access to the application.

https://bldr0cuomvip1.dev.internal.pri:30110

The positive aspect here is that regardless of which worker node the container is running on, you always have access. The problem with this method is that your load balancer has to know about the ports and update its configuration, plus your firewall has to allow access to either a range of ports or have an entry for each port. Not a killer, but it can complicate things, especially if you’re not assigning the NodePort yourself. Infrastructure as Code does help manage the load balancer and firewall configurations pretty well.

Side note, you can also access any worker node with the defined port number and Kubernetes will route you to the correct node. Certainly accessing the API server with the port number is optimum.

ExternalIPs

The use of externalIPs lets you access an application/container via the IP of the worker node the app is running on. You can then set up a DNS entry so you can access the application without needing a port number in the URL (8080 being a common example).

You’d update the above service to add the externalIPs line, which is the IP of the worker node the container is running on. In order to add the line, you’ll need to list the pods to see which node the container is running on.

$ kubectl get pods -o wide
NAME                               READY   STATUS    RESTARTS      AGE   IP              NODE                               NOMINATED NODE   READINESS GATES
curl                               1/1     Running   1 (17h ago)   18h   10.42.251.135   bldr0cuomknode1.dev.internal.pri   <none>           <none>
curl-deployment-7d9ff6d9d4-jz6gj   1/1     Running   0             12h   10.42.251.137   bldr0cuomknode1.dev.internal.pri   <none>           <none>
echoserver-6f54957b4d-94qm4        1/1     Running   0             45h   10.42.80.7      bldr0cuomknode3.dev.internal.pri   <none>           <none>
my-nginx-66689dbf87-9x6kt          1/1     Running   0             12h   10.42.80.12     bldr0cuomknode3.dev.internal.pri   <none>           <none>

We see the my-nginx pod is running on bldr0cuomknode3.dev.internal.pri. Get the IP for it and update the service (I know all my K8S nodes are 160-162 for control and 163-165 for workers so knode3 is 165).

$ kubectl edit svc my-nginx
# Please edit the object below. Lines beginning with a '#' will be ignored,
# and an empty file will abort the edit. If an error occurs while saving this file will be
# reopened with the relevant failures.
#
apiVersion: v1
kind: Service
metadata:
  creationTimestamp: "2023-04-06T01:49:28Z"
  labels:
    run: my-nginx
  name: my-nginx
  namespace: default
  resourceVersion: "1735857"
  uid: 439abcae-94d8-4810-aa44-2992d7a30a63
spec:
  clusterIP: 10.69.91.108
  clusterIPs:
  - 10.69.91.108
  externalIPs:
  - 192.168.101.165
  externalTrafficPolicy: Cluster
  internalTrafficPolicy: Cluster
  ipFamilies:
  - IPv4
  ipFamilyPolicy: SingleStack
  ports:
  - name: http
    nodePort: 32107
    port: 8080
    protocol: TCP
    targetPort: 80
  - name: https
    nodePort: 31943
    port: 443
    protocol: TCP
    targetPort: 443
  selector:
    run: my-nginx
  sessionAffinity: None
  type: NodePort
status:
  loadBalancer: {}

Then add the externalIPs: line as noted above. When done, check the services:

$ kubectl get svc
NAME         TYPE        CLUSTER-IP      EXTERNAL-IP       PORT(S)                        AGE
echoserver   NodePort    10.69.249.118   <none>            8080:32356/TCP                 45h
kubernetes   ClusterIP   10.69.0.1       <none>            443/TCP                        8d
my-nginx     NodePort    10.69.91.108    192.168.101.165   8080:32107/TCP,443:31943/TCP   13h

If you check the pod output above, note that the echoserver is also on knode3 and also uses port 8080. The issue here is that two services can’t expose the same external IP and port; only the first service will respond. Either move the pod or change the port to a unique one.

$ kubectl get svc
NAME         TYPE        CLUSTER-IP      EXTERNAL-IP       PORT(S)                        AGE
echoserver   NodePort    10.69.249.118   192.168.101.165   8080:32356/TCP                 45h
kubernetes   ClusterIP   10.69.0.1       <none>            443/TCP                        8d
my-nginx     NodePort    10.69.91.108    192.168.101.165   8080:32107/TCP,443:31943/TCP   12h

Finally, the problem should be clear. If knode3 goes away, or goes into maintenance mode, or heck is replaced, the IP address is now different. You’ll need to check the pods, update the service to point to the new node, then update DNS to use the new IP address. And depending on the DNS TTL, it could take some time before the new IP address is returned. Also what if you have more than one pod for load balancing or if you’re using Horizontal Pod Autoscaling (HPA)?

Ingress-Controllers

I checked out several ingress controllers, and because OpenShift uses an HAProxy-based ingress controller, that’s what I went with. There are several others of course and you’re free to pick the one that suits you.

The benefit of an Ingress Controller is that it combines the positive features of a NodePort and an ExternalIP. Remember, with a NodePort you access your application by using the load balancer IP or worker node IP, but with a unique port number, which is annoying because you have to manage firewalls for all the ports. With an ExternalIP, you can assign that to a Service and create a DNS entry pointing to that IP so folks can access the site through a well crafted DNS name. The problem of course is if the node goes away, you have to update the DNS with the new node IP where the pod now resides.

An Ingress Controller installs the selected ingress pod, which carries an ingress class label. You then create an Ingress route that references that class in a metadata annotation and create a DNS entry that points to the load balancer IP. The Ingress route matches the DNS hostname and the class, so incoming traffic goes to the Ingress Controller, which then sends it to the appropriate pod or pods regardless of worker.

Ingress Controller Installation

I’ve been in positions where I couldn’t use helm so I haven’t used it much, but the haproxy-ingress controller is only installable via a helm chart, so this is a first for me. First add the helm binary, then the helm chart repository for the controller.

helm repo add haproxy-ingress https://haproxy-ingress.github.io/charts

Next, create a custom values file; I called mine haproxy-ingress-values.yaml.

controller:
  hostNetwork: true

Then install the controller. This creates the ingress-controller namespace.

helm install haproxy-ingress haproxy-ingress/haproxy-ingress\
  --create-namespace --namespace ingress-controller\
  --version 0.14.2\
  -f haproxy-ingress-values.yaml

And that’s all there is to it. Next up is creating the necessary ingress rules for applications.
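
Before moving on, it’s worth confirming the controller is up:

kubectl get pods -n ingress-controller
kubectl get svc -n ingress-controller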

Ingress Controller Configuration

I’m going to be creating a real basic Ingress entry here to see how things work. I don’t need a lot of options but you should check out the documentation and feel free to adjust as necessary for your situation.

Initially I’ll be using a couple of examples I used when testing this process. In addition I have another document I used when I was managing Openshift which gave me the little hint on what I was doing wrong to this point.

There are two example sites I’m using to test this. One is from the kubernetes site (my-nginx) and one is from the haproxy-ingress site (echoserver) both linked in the References section.

my-nginx Project

The my-nginx project has several configuration files that make up the project. The one thing it doesn’t have is the ingress.yaml file needed for external access to the site. Following are the configurations used to build this site.

The configmap.yaml file provides data for the nginx web server.

apiVersion: v1
kind: ConfigMap
metadata:
  name: nginxconfigmap
data:
  default.conf: |
    server {
            listen 80 default_server;
            listen [::]:80 default_server ipv6only=on;

            listen 443 ssl;

            root /usr/share/nginx/html;
            index index.html;

            server_name localhost;
            ssl_certificate /etc/nginx/ssl/tls.crt;
            ssl_certificate_key /etc/nginx/ssl/tls.key;

            location / {
                    try_files $uri $uri/ =404;
            }
    }

For the nginxsecret.yaml Secret, you’ll first need to create a self-signed certificate and key using the openssl command.

openssl req -x509 -nodes -days 365 -newkey rsa:2048 \
  -keyout /var/tmp/nginx.key -out /var/tmp/nginx.crt \
  -subj "/CN=my-nginx/O=my-nginx"

You’ll then base64-encode the new certificate and key, copy the values into the nginxsecret.yaml file, and add it to the cluster.

apiVersion: "v1"
kind: "Secret"
metadata:
  name: "nginxsecret"
  namespace: "default"
type: kubernetes.io/tls
data:
  tls.crt: "LS0tLS1CRUdJTiBDRVJUSUZJQ0FURS0..."
  tls.key: "LS0tLS1CRUdJTiBQUklWQVRFIEtFWS0..."

After applying the secret, you’ll need to apply the service, which Kubernetes uses to connect ports with a label that is associated with the deployment. Note the run: my-nginx label here matches the same label in the deployment.yaml file; traffic coming to this service will go to any pod carrying that label.

apiVersion: v1
kind: Service
metadata:
  name: my-nginx
  labels:
    run: my-nginx
spec:
  type: NodePort
  ports:
  - port: 8080
    targetPort: 80
    protocol: TCP
    name: http
  - port: 443
    protocol: TCP
    name: https
  selector:
    run: my-nginx

Then apply the following deployment.yaml which will pull the nginx image from docker.io.

apiVersion: apps/v1
kind: Deployment
metadata:
  name: my-nginx
spec:
  selector:
    matchLabels:
      run: my-nginx
  replicas: 1
  template:
    metadata:
      labels:
        run: my-nginx
    spec:
      volumes:
      - name: secret-volume
        secret:
          secretName: nginxsecret
      - name: configmap-volume
        configMap:
          name: nginxconfigmap
      containers:
      - name: nginxhttps
        image: bprashanth/nginxhttps:1.0
        ports:
        - containerPort: 443
        - containerPort: 80
        volumeMounts:
        - mountPath: /etc/nginx/ssl
          name: secret-volume
        - mountPath: /etc/nginx/conf.d
          name: configmap-volume

When you check the service, because it’s a NodePort, you’ll see both the service ports (8080 and 443) and the exposed ports (31201 and 31713). The exposed ports can be used to access the application by going to the Load Balancer url and adding the port.

$ kubectl get svc
NAME         TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                        AGE
echoserver   NodePort    10.69.249.118   <none>        8080:32356/TCP                 9h
kubernetes   ClusterIP   10.69.0.1       <none>        443/TCP                        9d
my-nginx     NodePort    10.69.210.167   <none>        8080:31201/TCP,443:31713/TCP   27h

However that’s not an optimum process. You have to make sure users know what port is assigned and make sure the port is opened on your Load Balancer. With an Ingress Controller, you create a DNS CNAME that points to the Load Balancer and then apply this ingress.yaml route.

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: my-nginx
  namespace: default
  annotations:
    kubernetes.io/ingress.class: haproxy
spec:
  rules:
  - host: my-ingress.dev.internal.pri
    http:
      paths:
      - backend:
          service:
            name: my-nginx
            port:
              number: 8080
        path: /
        pathType: Prefix
  tls:
  - hosts:
    - my-ingress.dev.internal.pri

I created a my-ingress.dev.internal.pri DNS CNAME that points to bldr0cuomvip1.dev.internal.pri. When accessing https://my-ingress.dev.internal.pri, the ingress route sends you to the my-nginx service, which then transmits traffic to the application pod regardless of which worker node it resides on.

Let’s break this down just a little for clarity, in part because it didn’t click for me without some poking around and having a ping moment when looking at an old document I created for an Openshift cluster I was working on.

In the ingress.yaml file, the spec.rules.host and spec.tls.hosts lines are the DNS entries you created for the pod(s). The ingress controller matches incoming requests on this hostname and transmits traffic to the configured service.

The spec.rules.http.paths.backend.service.name is the name of the service this ingress route transmits traffic to, and service.port.number is the port listed in that service.

The path line is interesting: you can serve multiple paths, each routed to a different backend service, by adding more path entries. In general this is a single website, so / is appropriate for the majority of cases.

The important thing is the annotations line. It has to point to the ingress controller’s class. For the haproxy-ingress controller it’s as listed, but you can verify by describing the pod.

kubectl describe pod haproxy-ingress-7bc69b8cc-wq2hc  -n ingress-controller
...
    Args:
      --configmap=ingress-controller/haproxy-ingress
      --ingress-class=haproxy
      --sort-backends
...

In this case we see the passed argument --ingress-class=haproxy. This matches the annotations line and tells the ingress route which controller is load balancing traffic within the cluster.

Once applied, you can then go to https://my-ingress.dev.internal.pri and access the nginx startup page.
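
A quick check from the command line (-k because the certificate is self-signed):

curl -k https://my-ingress.dev.internal.pri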

echoserver Project

This one is a little simpler but still can show us how to use an ingress route to access a pod.

All you need is a service.yaml file so the ingress route knows where to transmit traffic.

apiVersion: v1
kind: Service
metadata:
  labels:
    app: echoserver
  name: echoserver
  namespace: default
spec:
  clusterIP: 10.69.249.118
  clusterIPs:
  - 10.69.249.118
  externalTrafficPolicy: Cluster
  internalTrafficPolicy: Cluster
  ipFamilies:
  - IPv4
  ipFamilyPolicy: SingleStack
  ports:
  - nodePort: 32356
    port: 8080
    protocol: TCP
    targetPort: 8080
  selector:
    app: echoserver
  sessionAffinity: None
  type: NodePort

Then a deployment.yaml file to load the container.

apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: echoserver
  name: echoserver
  namespace: default
spec:
  progressDeadlineSeconds: 600
  replicas: 1
  revisionHistoryLimit: 10
  selector:
    matchLabels:
      app: echoserver
  strategy:
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
    type: RollingUpdate
  template:
    metadata:
      creationTimestamp: null
      labels:
        app: echoserver
    spec:
      containers:
      - image: k8s.gcr.io/echoserver:1.3
        imagePullPolicy: IfNotPresent
        name: echoserver
        resources: {}
        terminationMessagePath: /dev/termination-log
        terminationMessagePolicy: File
      dnsPolicy: ClusterFirst
      restartPolicy: Always
      schedulerName: default-scheduler
      securityContext: {}
      terminationGracePeriodSeconds: 30

For me, the problem was that the example was a single line to create the ingress route, which wasn’t enough information to help me create it. A lot of the problem with examples is that they expect cloud usage, where you’ll have an AWS, GCE, or Azure load balancer. For on-prem it seems to be less obvious in the examples, which is why I’m doing it this way. It helps me and may help others.

Here is the ingress.yaml file I used to access the application. Remember you have to create a DNS CNAME for the access and you’ll need the port number from the service definition (8080).

apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: echoserver
  namespace: default
  annotations:
    kubernetes.io/ingress.class: haproxy
spec:
  rules:
  - host: echoserver.dev.internal.pri
    http:
      paths:
      - backend:
          service:
            name: echoserver
            port:
              number: 8080
        path: /
        pathType: Prefix
  tls:
  - hosts:
    - echoserver.dev.internal.pri

And with this ingress route, you have access to the echoserver pod. As I progress in loading tools and my llamas website, I’ll provide the ingress.yaml file so you can see how it’s done.

References

Persistent Storage

Overview

In this article I’ll configure and verify Persistent Storage for the Kubernetes cluster.

Installation

This is a simple installation. The NFS server has 100 gigs of space which will be used for any Persistent Volume Claims (PVCs) needed by applications.

Apply the following storage-pv.yaml file.

apiVersion: v1
kind: PersistentVolume
metadata:
  name: storage-pv
spec:
  capacity:
    storage: 100Gi
  accessModes:
    - ReadWriteMany
  persistentVolumeReclaimPolicy: Retain
  nfs:
    path: /srv/nfs4/storage
    server: 192.168.101.170

Verify by checking the PV in the cluster.

$ kubectl get pv
NAME         CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM   STORAGECLASS   REASON   AGE
storage-pv   100Gi      RWX            Retain           Available                                   7s

And that’s it. Storage is now available for any applications.
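
To use it, an application defines a PVC against this volume. Here’s a minimal sketch of one; the name, namespace, and requested size are made up for illustration, and the empty storageClassName keeps Kubernetes from waiting on a dynamic provisioner so the claim binds to this statically created PV.

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: example-pvc
  namespace: default
spec:
  accessModes:
    - ReadWriteMany
  storageClassName: ""
  resources:
    requests:
      storage: 10Gi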

Kubernetes Networking

Overview

This article provides instructions for installing the networking layer on the Kubernetes clusters.

Calico Networking

You’ll need to install Calico, which is the network layer for the cluster. There are two files you’ll retrieve from Tigera, who makes Calico: tigera-operator.yaml and custom-resources.yaml.

In the custom-resources.yaml file, update the spec.calicoNetwork.ipPools.cidr line to point to the PodNetwork. In my case, 10.42.0.0/16.
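
For reference, the relevant part of my custom-resources.yaml ends up looking something like this. Everything other than the cidr value comes from the stock file Tigera ships, and the surrounding fields may differ slightly between Calico versions:

apiVersion: operator.tigera.io/v1
kind: Installation
metadata:
  name: default
spec:
  calicoNetwork:
    ipPools:
    - blockSize: 26
      cidr: 10.42.0.0/16
      encapsulation: VXLANCrossSubnet
      natOutgoing: Enabled
      nodeSelector: all()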

In the tigera-operator.yaml file, update the image: line to point to the on-prem insecure registry and any imagePullPolicy lines to Always.

Once done, use kubectl to install the two configurations. First the tigera-operator.yaml file, then the custom-resources.yaml file.

kubectl create -f tigera-operator.yaml
kubectl create -f custom-resources.yaml

When done and all is working, you should also see several calico pods start up.

$ kubectl get pods -A | grep -E "(calico|tigera)"
calico-apiserver   calico-apiserver-6fd86fcb4b-77tld                         1/1     Running   0             32m
calico-apiserver   calico-apiserver-6fd86fcb4b-p6bzc                         1/1     Running   0             32m
calico-system      calico-kube-controllers-dd6c88556-zhg6b                   1/1     Running   0             45m
calico-system      calico-node-66fkb                                         1/1     Running   0             45m
calico-system      calico-node-99qs2                                         1/1     Running   0             45m
calico-system      calico-node-dtzgf                                         1/1     Running   0             45m
calico-system      calico-node-ksjpr                                         1/1     Running   0             45m
calico-system      calico-node-lhhrl                                         1/1     Running   0             45m
calico-system      calico-node-w8nmx                                         1/1     Running   0             45m
calico-system      calico-typha-69f9d4d5b4-vp7mp                             1/1     Running   0             44m
calico-system      calico-typha-69f9d4d5b4-xv5tg                             1/1     Running   0             45m
calico-system      calico-typha-69f9d4d5b4-z65kn                             1/1     Running   0             44m
calico-system      csi-node-driver-5czsp                                     2/2     Running   0             45m
calico-system      csi-node-driver-ch746                                     2/2     Running   0             45m
calico-system      csi-node-driver-gg9f4                                     2/2     Running   0             45m
calico-system      csi-node-driver-kwbwp                                     2/2     Running   0             45m
calico-system      csi-node-driver-nh564                                     2/2     Running   0             45m
calico-system      csi-node-driver-rvfd4                                     2/2     Running   0             45m
tigera-operator    tigera-operator-7d89d9444-4scfq                           1/1     Running   0             45m

It does take a bit so give it some time to get going.

Troubleshooting

I did have a problem with the installation the first time, as I hadn’t updated the cidr line in custom-resources.yaml with my pod network configuration. After rebuilding the cluster, I updated the file, reapplied, and it worked. One other issue was that crio wasn’t enabled or started on the first control node for some reason; once it was enabled and started, everything worked as expected.

Kubernetes Metrics Server

Overview

The metrics server collects resource metrics from your Kubernetes cluster. It’s also used by the Horizontal Pod Autoscaling (HPA) function to scale pods.

Installation

For my clusters, it’s a pretty simple configuration. I retrieve the components.yaml file from the metrics-server github site (see References below), compare it with the previous version if any, retrieve the images, tag, and push them to the local repository, then update the components.yaml file to point to the local repository. When done, simply apply it to the cluster.

kubectl apply -f components.yaml
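
Once the pod is running (see the Issue section below if it isn’t), a quick way to confirm metrics are flowing:

kubectl top nodes
kubectl top pods -A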

Issue

I found that if I add the serverTLSBootstrap: true line to the KubeletConfiguration block when initializing the cluster, it’ll be added to the appropriate config.yaml files and the kubelet-config ConfigMap in the cluster automatically. I’ll leave this section here as a reminder in case it pops up.

There is one issue that has to be addressed; see the References section for a link. Basically, one of the metrics-server flags prefers node IP addresses before external IPs or hostnames. Since IP addresses weren’t part of this cluster’s certificate setup, metrics-server won’t start, generating tons of certificate errors. Of course you can move Hostname to the front of the list, but then you’re adding a DNS lookup to your list of tasks. You can also add an ignore-TLS flag, which of course isn’t secure.

kube-system       metrics-server-5597479f8d-fn8xm                           0/1     Running   0               13h

What to do?

First you’ll need to edit the kubelet-config configmap and add serverTLSBootstrap: true right after the kind: KubeletConfiguration line and save it.

$ kubectl edit configmap kubelet-config -n kube-system
configmap/kubelet-config edited

Next you’ll have to edit every control node and worker node’s /var/lib/kubelet/config.yaml file and add the same line at the same place and restart kubelet.
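
As a sketch, the per-node change looks like this, assuming a systemd-managed kubelet:

# on each control and worker node
sudo vi /var/lib/kubelet/config.yaml    # add "serverTLSBootstrap: true" under "kind: KubeletConfiguration"
sudo systemctl restart kubelet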

Finally, Certificate Signing Requests (CSRs) will be created for each node. You’ll need to approve each one.

$ kubectl get csr
NAME        AGE   SIGNERNAME                      REQUESTOR                                      REQUESTEDDURATION   CONDITION
csr-4kr8m   20s   kubernetes.io/kubelet-serving   system:node:bldr0cuomkube3.dev.internal.pri    <none>              Pending
csr-fqpvs   28s   kubernetes.io/kubelet-serving   system:node:bldr0cuomknode3.dev.internal.pri   <none>              Pending
csr-m526d   27s   kubernetes.io/kubelet-serving   system:node:bldr0cuomkube2.dev.internal.pri    <none>              Pending
csr-nc6t7   27s   kubernetes.io/kubelet-serving   system:node:bldr0cuomkube1.dev.internal.pri    <none>              Pending
csr-wxhfd   28s   kubernetes.io/kubelet-serving   system:node:bldr0cuomknode1.dev.internal.pri   <none>              Pending
csr-z42x4   28s   kubernetes.io/kubelet-serving   system:node:bldr0cuomknode2.dev.internal.pri   <none>              Pending
$ kubectl certificate approve csr-4kr8m
certificatesigningrequest.certificates.k8s.io/csr-4kr8m approved
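
Rather than approving them one at a time, something like this one-liner (a sketch) approves everything currently pending:

kubectl get csr --no-headers | awk '/Pending/ {print $1}' | xargs -r kubectl certificate approve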

During this process, if you’re monitoring the pods, you’ll see the metrics-server start once you’ve approved the CSR for the node where the metrics-server is running. Make sure you do all the servers.

kube-system       metrics-server-5597479f8d-fn8xm                           1/1     Running   0               13h

Issue

I’m still working through this, but whether I start the metrics-server before or after Calico, it requires the pod to be deleted to actually get metrics. I’ll try a few more installations to see if I can identify exactly when the metrics-server should be started.

References

  • https://github.com/kubernetes-sigs/metrics-server
  • https://kubernetes.io/docs/tasks/administer-cluster/kubeadm/kubeadm-certs/#kubelet-serving-certs
  • https://github.com/kubernetes-sigs/metrics-server/issues/196 – This helped me resolve the issue mainly by pointing to the actual docs, but there is good troubleshooting info here.
Installing Kubernetes

Overview

This article provides instructions for building the Kubernetes cluster using kubeadm, plus any post-installation requirements.

Build Cluster

On the first control plane node run the kubeadm command.

kubeadm init --config kubeadm-config.yaml --upload-certs

After the first node has been initialized, the connect strings to join the remaining two control plane nodes and the three worker nodes to the new cluster will be provided. Add the second control plane node using the string, then the third. Do them one at a time and in order, as the third one will time out while the second one is pulling images.

When all three control plane nodes are up, use the worker connect string from the first control plane node and add in all three worker nodes. They can be added in parallel or sequentially but they do get added quickly.

You can then check the status of the cluster.

$ kubectl get nodes
NAME                               STATUS   ROLES           AGE   VERSION
bldr0cuomknode1.dev.internal.pri   Ready    <none>          8d    v1.25.7
bldr0cuomknode2.dev.internal.pri   Ready    <none>          8d    v1.25.7
bldr0cuomknode3.dev.internal.pri   Ready    <none>          8d    v1.25.7
bldr0cuomkube1.dev.internal.pri    Ready    control-plane   8d    v1.25.7
bldr0cuomkube2.dev.internal.pri    Ready    control-plane   8d    v1.25.7
bldr0cuomkube3.dev.internal.pri    Ready    control-plane   8d    v1.25.7

And check all the pods as well to make sure everything is running as expected.

$ kubectl get pods -A
NAMESPACE         NAME                                                      READY   STATUS    RESTARTS     AGE
kube-system       coredns-565d847f94-bp2c7                                  1/1     Running   2            8d
kube-system       coredns-565d847f94-twlvf                                  1/1     Running   0            3d17h
kube-system       etcd-bldr0cuomkube1.dev.internal.pri                      1/1     Running   0            4d
kube-system       etcd-bldr0cuomkube2.dev.internal.pri                      1/1     Running   1 (4d ago)   4d
kube-system       etcd-bldr0cuomkube3.dev.internal.pri                      1/1     Running   0            18h
kube-system       kube-apiserver-bldr0cuomkube1.dev.internal.pri            1/1     Running   0            4d
kube-system       kube-apiserver-bldr0cuomkube2.dev.internal.pri            1/1     Running   0            4d
kube-system       kube-apiserver-bldr0cuomkube3.dev.internal.pri            1/1     Running   0            18h
kube-system       kube-controller-manager-bldr0cuomkube1.dev.internal.pri   1/1     Running   0            4d
kube-system       kube-controller-manager-bldr0cuomkube2.dev.internal.pri   1/1     Running   0            4d
kube-system       kube-controller-manager-bldr0cuomkube3.dev.internal.pri   1/1     Running   0            18h
kube-system       kube-proxy-bpcfh                                          1/1     Running   1            8d
kube-system       kube-proxy-jl469                                          1/1     Running   1            8d
kube-system       kube-proxy-lrbh6                                          1/1     Running   2            8d
kube-system       kube-proxy-n9q4f                                          1/1     Running   2            8d
kube-system       kube-proxy-tf9wt                                          1/1     Running   1            8d
kube-system       kube-proxy-v66pt                                          1/1     Running   2            8d
kube-system       kube-scheduler-bldr0cuomkube1.dev.internal.pri            1/1     Running   0            4d
kube-system       kube-scheduler-bldr0cuomkube2.dev.internal.pri            1/1     Running   0            4d
kube-system       kube-scheduler-bldr0cuomkube3.dev.internal.pri            1/1     Running   0            18h

Certificate Signing Requests

When the cluster is up, due to the kubelet configuration updates you’ll need to approve some CSRs. It’s an easy process with one caveat: the certs are only good for a year, so you’ll need to do this again next year. Make a note.

$ kubectl get csr
NAME        AGE   SIGNERNAME                      REQUESTOR                                      REQUESTEDDURATION   CONDITION
csr-4kr8m   20s   kubernetes.io/kubelet-serving   system:node:bldr0cuomkube3.dev.internal.pri    <none>              Pending
csr-fqpvs   28s   kubernetes.io/kubelet-serving   system:node:bldr0cuomknode3.dev.internal.pri   <none>              Pending
csr-m526d   27s   kubernetes.io/kubelet-serving   system:node:bldr0cuomkube2.dev.internal.pri    <none>              Pending
csr-nc6t7   27s   kubernetes.io/kubelet-serving   system:node:bldr0cuomkube1.dev.internal.pri    <none>              Pending
csr-wxhfd   28s   kubernetes.io/kubelet-serving   system:node:bldr0cuomknode1.dev.internal.pri   <none>              Pending
csr-z42x4   28s   kubernetes.io/kubelet-serving   system:node:bldr0cuomknode2.dev.internal.pri   <none>              Pending
$ kubectl certificate approve csr-4kr8m
certificatesigningrequest.certificates.k8s.io/csr-4kr8m approved

Security Settings

Per the CIS group, several of the installed files need to be updated to ensure proper settings. Review the CIS documentation to see which files and directories need to be updated.

Image Updates

As noted earlier, update the Kubernetes manifests to point to the local image registry. These files are on each of the control nodes in the /etc/kubernetes/manifests directory. In addition, update the imagePullPolicy to Always, which ensures you always get the correct, uncorrupted image. The kube and etcd containers will restart automatically when the manifest files are updated.

Conclusion

The cluster is up. Next we’ll add the network management layer (Calico), metrics-server, an ingress controller, and, for the development cluster, a continuous delivery tool (ArgoCD).

Preparing Kubernetes

Overview

This article provides a howto on preparing hosts to install Kubernetes 1.25.7 on CentOS 7 using kubeadm. I’ll be using CRI-O as the container environment and Calico for the network layer. A followup article provides instructions for building the cluster and post-installation needs.

Note that I tried Rocky Linux 8 but podman isn’t current enough for CRI-O and is throwing errors due to a change in the configuration file from a single entry to multiple entries.

Insecure Registries

Currently I’m using an on-prem insecure registry. I installed the docker distribution software which works well enough to host local images. Then on a docker server, I pull the necessary images, tag them with the local information, and then push them to the new local registry. Then I update kubernetes manifests and other tools to point to the local registry. With this, I’m not pulling images from the internet every time I make some change or another.

Prepare Hosts

There are a few things that need to be done with the hosts to make them ready.

Container Runtime

In order to use a container runtime, you’ll need to create a couple of files. You’ll be creating a bridge module file and an overlay module file, and modifying the system with sysctl.

First, in /etc/modules-load.d create a br_netfilter.conf file containing:

br_netfilter

Next create the /etc/modules-load.d/overlay.conf file containing:

overlay

You can either restart the system or simply use modprobe to load the modules.

modprobe overlay
modprobe br_netfilter

Next create /etc/sysctl.d/kubernetes.conf and add the following lines:

net.bridge.bridge-nf-call-iptables = 1
net.bridge.bridge-nf-call-ip6tables = 1
net.ipv4.ip_forward = 1

Again, restart the system or simply reload the sysctl table:

sysctl --system

Disable swap

First off, disable and remove swap from all the nodes, control and worker. Since Kubernetes manages resources itself, swap is not needed.

  • Remove the /dev/mapper/vg00-swap line from /etc/fstab
  • Remove rd.lvm.lv=vg00/swap from /etc/default/grub and run grub2-mkconfig -o /boot/grub2/grub.cfg to rebuild the grub.cfg file.
  • Disable swap by running swapoff -v /dev/mapper/vg00-swap
  • Run umount /dev/mapper/vg00-swap to remove swap, then run lvremove /dev/mapper/vg00-swap to recover the space.

If SELinux is configured, ensure the SELINUX line in /etc/selinux/config is set to permissive so the setting persists across reboots.
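
A quick sketch of the change; setenforce also switches the running system to permissive without waiting for a reboot:

sed -i 's/^SELINUX=enforcing$/SELINUX=permissive/' /etc/selinux/config
setenforce 0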

You may want to do some Quality of Service management. If so, install the iproute-tc tool. See the References section for further information on the software.

Firewalls

I run firewalls on all my servers since I follow the zero-trust networking model; however, because I'm using Calico for the network layer and it manages its own rules, the host firewall needs to be disabled on all nodes.
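
With firewalld that's a single command on each node (a sketch, assuming firewalld is the firewall in use):

systemctl disable --now firewalld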

Docker

At least for CentOS 7, install docker and its tools on all nodes.

yum install -y docker docker-common docker-client

Configure docker to allow access to the on-prem insecure registries. Without this, docker will not pull the images. In addition, you want to use journald for logging. Update the /etc/docker/daemon.json file as follows:

{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "10m",
    "max-file": "5"
  },
  "insecure-registries": ["bldr0cuomrepo1.dev.internal.pri:5000"]
}

In addition, update the docker system startup file and add the following flag.

--log-driver=journald
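
After updating the configuration, restart docker and confirm the registry setting took effect (a quick sketch):

systemctl restart docker
docker info | grep -A 2 'Insecure Registries'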

Container Runtime

Now the systems are ready for CRI-O. You'll need to add a couple of repositories to your control nodes before doing the installation. In addition, as of 1.24.0 you have the option of selecting a CNI plugin; I'll be using containernetworking-plugins as that's how it was set up, but you can select a different one if you like.

Configure Repositories

You’ll need to add the two repositories as provided below. While we can pull the files from the CRI-O website, as always we want consistency across the clusters. We are installing 1.24.0 on CentOS 7.

First the crio.repo file. Save it in /etc/yum.repos.d/crio.repo

[devel_kubic_libcontainers_stable_cri-o_1.24]
name=devel:kubic:libcontainers:stable:cri-o:1.24 (CentOS_7)
type=rpm-md
baseurl=https://download.opensuse.org/repositories/devel:/kubic:/libcontainers:/stable:/cri-o:/1.24/CentOS_7/
gpgcheck=1
gpgkey=https://download.opensuse.org/repositories/devel:/kubic:/libcontainers:/stable:/cri-o:/1.24/CentOS_7/repodata/repomd.xml.key
enabled=1

Next is the stable.repo. Again save it in /etc/yum.repos.d/stable.repo

[devel_kubic_libcontainers_stable]
name=Stable Releases of Upstream github.com/containers packages (CentOS_7)
type=rpm-md
baseurl=https://download.opensuse.org/repositories/devel:/kubic:/libcontainers:/stable/CentOS_7/
gpgcheck=1
gpgkey=https://download.opensuse.org/repositories/devel:/kubic:/libcontainers:/stable/CentOS_7/repodata/repomd.xml.key
enabled=1

Install the crio package.

yum install crio

Then install the CNI of choice.

yum install containernetworking-plugins

In order for CRI-O to know about the on-prem insecure registries, you’ll need to update the /etc/containers/registries.conf. Add the following TOML formatted block of code.

[[registry]]
prefix = "bldr0cuomrepo1.dev.internal.pri:5000"
insecure = true
location = "bldr0cuomrepo1.dev.internal.pri:5000"

The pause container isn't displayed when listing pods, but Kubernetes uses it to hold each pod's network namespace so restarting or crashing containers don't lose their network configuration. To point it at the local insecure registry, update the pause_image setting (under the [crio.image] section) in /etc/crio/crio.conf:

pause_image = "bldr0cuomrepo1.dev.internal.pri:5000/pause:3.6"

When everything is installed, enable and start crio.

systemctl enable crio
systemctl start crio

Kubernetes Binaries

In order to install kubernetes binaries, you’ll first need to install the kubernetes repository into /etc/yum.repos.d. Create the file, kubernetes.repo and add the following lines.

[kubernetes]
name=Kubernetes
baseurl=https://packages.cloud.google.com/yum/repos/kubernetes-el7-x86_64
enabled=1
gpgcheck=1
repo_gpgcheck=1
gpgkey=https://packages.cloud.google.com/yum/doc/yum-key.gpg https://packages.cloud.google.com/yum/doc/rpm-package-key.gpg
exclude=kubelet kubeadm kubectl

And now, install the necessary binaries.

yum install -y kubelet kubeadm kubectl --disableexcludes=kubernetes

Next enable kubelet. You won’t be able to start it as the config.yaml file doesn’t exist yet. That’s created when you run kubeadm.

systemctl enable kubelet

Build kubeadm Config

There are multiple options for the kubeadm-config.yaml file. Here is the one I’m using when building the cluster. This file should only be on the first control node as once the cluster is started, you’ll have commands to run to join other control and worker nodes to the first control node.

apiVersion: kubeadm.k8s.io/v1beta3
kind: InitConfiguration
nodeRegistration:
  imagePullPolicy: Always
---
apiVersion: kubeadm.k8s.io/v1beta3
clusterName: "bldr"
controlPlaneEndpoint: "bldr0cuomvip1.dev.internal.pri:6443"
etcd:
  local:
    imageRepository: "bldr0cuomrepo1.dev.internal.pri:5000"
imageRepository: "bldr0cuomrepo1.dev.internal.pri:5000"
kind: ClusterConfiguration
kubernetesVersion: "1.25.7"
networking:
  podSubnet: "10.42.0.0/16"
  serviceSubnet: "10.69.0.0/16"
---
apiVersion: kubelet.config.k8s.io/v1beta1
kind: KubeletConfiguration
serverTLSBootstrap: true

There are three sections here to detail.

InitConfiguration

For security purposes, we want every image to be pulled fresh each time it's needed. Since the image repository is on-prem, setting this to Always isn't a big issue.

ClusterConfiguration

There are several options we’ll set to make sure we are running properly when initializing the cluster.

clusterName: I have four environments that will have clusters. The sites are bldr (dev), cabo (qa), tato (stage), and lnmt (production). Set this to one of the environments.

controlPlaneEndpoint: This is the HAProxy VIP along with the port of 6443.

imageRepository: This is the local image repository, in this case bldr0cuomrepo1.dev.internal.pri:5000. It's set for both the etcd image and the kubernetes control plane images.

kubernetesVersion: Set it to the version being installed, in this case 1.25.7.

networking.podSubnet: Set to the network all the pods will be started on.

networking.serviceSubnet: Set to the network all internal services will use.

KubeletConfiguration

This is used by the metrics-server in order to access the cluster and return statistics. This setting is applied to every server's kubelet config.yaml file plus to the cluster's kubeadm-config configmap.

As a note, Certificate Signing Requests (CSRs) will need to be approved once the cluster is up.

Conclusion

The servers are all prepared and ready to be started. Log in to the first control node and follow the instructions for building the cluster.

References


Kubernetes Storage

Overview

This article provides some quick instructions on creating an NFS server for use as Persistent Storage in Kubernetes. A different article will discuss creating Persistent Storage.

Firewall Configuration

For the NFS server, which will only be accessed by Kubernetes, we'll restrict access to the NFS share to the environment's network. To do that without blocking ssh access, we'll create a new firewall zone called nfs and add the nfs, rpc-bind, and mountd services to it along with the network range.
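
A sketch of the firewall-cmd calls that build that zone (assuming firewalld is running and the source network matches your environment):

firewall-cmd --permanent --new-zone=nfs
firewall-cmd --permanent --zone=nfs --add-source=192.168.101.0/24
firewall-cmd --permanent --zone=nfs --add-service=nfs --add-service=rpc-bind --add-service=mountd
firewall-cmd --reload

Ultimately we'll have the following configuration.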

# firewall-cmd --zone nfs --list-all
nfs (active)
  target: default
  icmp-block-inversion: no
  interfaces:
  sources: 192.168.101.0/24
  services: mountd nfs rpc-bind
  ports:
  protocols:
  forward: no
  masquerade: no
  forward-ports:
  source-ports:
  icmp-blocks:
  rich rules:

NFS Configuration

To prepare the storage, we’ll create the three directories. We’re creating a registry directory for OpenShift/OKD4 although it’s not used in Kubernetes. I do have an OKD4 cluster that will use this storage as well.

mkdir -p /srv/nfs4
chmod 755 /srv/nfs4
chown -R root:root /srv

mkdir /srv/nfs4/registry
chmod 755 /srv/nfs4/registry
chown nobody:nobody /srv/nfs4/registry

mkdir /srv/nfs4/storage
chmod 755 /srv/nfs4/storage
chown nobody:nobody /srv/nfs4/storage

NFS Installation

Install the nfs-utils and python3-libselinux packages.
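
On this host that's a single command (a sketch, assuming a yum/dnf based system):

yum install -y nfs-utils python3-libselinux

Then create the /etc/exports file that defines the shared drives.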

/srv/nfs4              192.168.101.0/24(rw,sync,no_subtree_check,crossmnt,fsid=0)
/srv/nfs4/registry     192.168.101.0/24(rw,sync,no_subtree_check,no_root_squash,no_all_squash,insecure,fsid=1)
/srv/nfs4/storage      192.168.101.0/24(rw,sync,no_subtree_check,no_root_squash,no_all_squash,insecure,fsid=2)

Export the file systems.

exportfs -ra

Enable and start the nfs-server.

systemctl enable nfs-server
systemctl start nfs-server

Verification

To make sure the shares are ready, run the following command.

# showmount --exports
Export list for bldr0cuomnfs1.dev.internal.pri:
/srv/nfs4/storage  192.168.101.0/24
/srv/nfs4/registry 192.168.101.0/24
/srv/nfs4          192.168.101.0/24
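
As an optional check from a client, mount one of the shares (a sketch, assuming NFSv4; with fsid=0 the paths are relative to /srv/nfs4):

mount -t nfs4 bldr0cuomnfs1.dev.internal.pri:/storage /mnt
df -h /mnt
umount /mnt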

And finished.


Load Balancing Kubernetes

Overview

This article provides instructions on how I set up my HAProxy servers (yes, two) to provide access to the Kubernetes cluster.

Configuration

To emulate a production like environment, I’m configuring two HAProxy servers to provide access to the Kubernetes cluster. In order to ensure access to Kubernetes, I’m also installing keepalived. In addition, I’m using a tool called monit to ensure the haproxy binary continues to run in case it stops.

The server configuration isn't gigantic. I'm using my default CentOS 7.9 image, so each server has 2 CPUs, 4 GB of memory, and 100 GB of storage, of which only 32 GB is allocated.

HAProxy

I am making a few changes to the default installation of haproxy. In the global block the following configuration is in place.

global
        log /dev/log local0
        log /dev/log local1 notice
        chroot /var/lib/haproxy
        stats socket /var/lib/haproxy/admin.sock mode 660 level admin expose-fd listeners
        stats timeout 30s
        user haproxy
        group haproxy
        daemon

        # Default SSL material locations
        ca-base /etc/ssl/certs
        crt-base /etc/ssl/private

        # Default ciphers to use on SSL-enabled listening sockets.
        # For more information, see ciphers(1SSL). This list is from:
        #  https://hynek.me/articles/hardening-your-web-servers-ssl-ciphers/
        # An alternative list with additional directives can be obtained from
        #  https://mozilla.github.io/server-side-tls/ssl-config-generator/?server=haproxy
        ssl-default-bind-ciphers ECDH+AESGCM:DH+AESGCM:ECDH+AES256:DH+AES256:ECDH+AES128:DH+AES:RSA+AESGCM:RSA+AES:!aNULL:!MD5:!DSS
        ssl-default-bind-options no-sslv3

In the defaults block of the haproxy.cfg file, the following configuration is in place.

defaults
        log     global
        mode    http
        option  httplog
        option  dontlognull
        timeout connect 5s
        timeout client  50s
        timeout server  50s

I also added a listener on port 1936 so you can go to the web page and see various statistics. Don't forget to open the firewall so you can reach the stats (see the sketch after the listener).

listen stats
        bind *:1936
        mode http
        log  global
        maxconn 10
        stats enable
        stats hide-version
        stats refresh 30s
        stats show-node
        stats show-desc Stats for the k8s cluster
        stats uri /
        monitor-uri /healthz/ready
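
To open the stats port, a firewalld sketch (adjust the zone if your servers use a non-default one):

firewall-cmd --permanent --add-port=1936/tcp
firewall-cmd --reload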

There are two ports that need to be open for the Kubernetes control plane nodes: 6443 for the API server and 22623 for the machine config server. Set up the frontend and backend configurations as follows:

frontend kubernetes-api-server
        bind *:6443
        default_backend kubernetes-api-server
        mode tcp
        option tcplog

backend kubernetes-api-server
        mode tcp
        server bldr0cuomkube1 192.168.101.160:6443 check
        server bldr0cuomkube2 192.168.101.161:6443 check
        server bldr0cuomkube3 192.168.101.162:6443 check


frontend machine-config-server
        bind *:22623
        default_backend machine-config-server
        mode tcp
        option tcplog

backend machine-config-server
        mode tcp
        server bldr0cuomkube1 192.168.101.160:22623 check
        server bldr0cuomkube2 192.168.101.161:22623 check
        server bldr0cuomkube3 192.168.101.162:22623 check

For the worker nodes, the following configuration for ports 80 and 443 are required.

frontend ingress-http
        bind *:80
        default_backend ingress-http
        mode tcp
        option tcplog

backend ingress-http
        balance source
        mode tcp
        server bldr0cuomknode1-http-router0 192.168.101.163:80 check
        server bldr0cuomknode2-http-router1 192.168.101.164:80 check
        server bldr0cuomknode3-http-router2 192.168.101.165:80 check


frontend ingress-https
        bind *:443
        default_backend ingress-https
        mode tcp
        option tcplog

backend ingress-https
        balance source
        mode tcp
        server bldr0cuomknode1-http-router0 192.168.101.163:443 check
        server bldr0cuomknode2-http-router1 192.168.101.164:443 check
        server bldr0cuomknode3-http-router2 192.168.101.165:443 check

Before starting haproxy, you’ll need to do some configuration work. For logging, create the /var/log/haproxy directory as logs will be stored there.

Since we’re using chroot to isolate haproxy, create the /var/lib/haproxy/dev directory. Then create a socket for the logs:

python3 -c "import socket as s; sock = s.socket(s.AF_UNIX); sock.bind('/var/lib/haproxy/dev/log')"

To point to this new device, add the following configuration file to /etc/rsyslog.d called 49-haproxy.conf and restart rsyslog.

# Create an additional socket in haproxy's chroot in order to allow logging via
# /dev/log to chroot'ed HAProxy processes
$AddUnixListenSocket /var/lib/haproxy/dev/log

# Send HAProxy messages to a dedicated logfile
if $programname startswith 'haproxy' then /var/log/haproxy/haproxy.log
&~
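
With logging in place, validate the configuration and bring haproxy up (a quick sketch):

haproxy -c -f /etc/haproxy/haproxy.cfg
systemctl enable haproxy
systemctl start haproxy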

keepalived

Since there are two servers, I have hap1 as the primary and hap2 as the secondary server. On the primary server, use the following configuration.

vrrp_script chk_haproxy {
    script "killall -0 haproxy"
    interval 2
    weight 2
}

vrrp_instance VI_1 {
    interface ens192
    state MASTER
    priority 200

    virtual_router_id 33
    unicast_src_ip 192.168.101.61
    unicast_peer {
        192.168.101.62
    }

    advert_int 1
    authentication {
        auth_type PASS
        auth_pass [unique password]
    }

    virtual_ipaddress {
        192.168.101.100
    }

    track_script {
        chk_haproxy
    }
}

And on the backup server:

vrrp_script chk_haproxy {
    script "killall -0 haproxy"
    interval 2
    weight 2
}

vrrp_instance VI_1 {
    interface ens192
    state BACKUP
    priority 100

    virtual_router_id 33
    unicast_src_ip 192.168.101.62
    unicast_peer {
        192.168.101.61
    }

    advert_int 1
    authentication {
        auth_type PASS
        auth_pass [Unique password]
    }

    virtual_ipaddress {
        192.168.101.100
    }

    track_script {
        chk_haproxy
    }
}
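
On both servers, enable and start keepalived, then confirm the VIP landed on the primary (a sketch, using the ens192 interface from the configuration above):

systemctl enable keepalived
systemctl start keepalived
ip addr show ens192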

monit

The monit tool watches running processes and if the process ceases to exist, the tool restarts the process. It can be configured to notify admins as well. The following changes were made to the default monit configuration.

Note that the username and password appear to be hard coded into monit. The best I could do was ensure access was read-only.

set daemon  120              # check services at 2 minute intervals

set log /var/log/monit.log

set idfile /var/lib/monit/.monit.id

set statefile /var/lib/monit/.monit.state

set eventqueue
    basedir /var/lib/monit/events  # set the base directory where events will be stored
    slots 100                      # optionally limit the queue size

set httpd
    port 2812
     address 192.168.101.62                  # listen on this server's address
     allow 192.168.101.62/255.255.255.255    # allow this server to connect
     allow 192.168.101.90/255.255.255.255                    # allow connections from the tool server
     allow 192.168.0.0/255.255.0.0                           # allow connections from the internal servers
     allow admin:monit read-only   # require authentication

include /etc/monit.d/*
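
The check that watches haproxy goes in /etc/monit.d; a sketch of a typical check, assuming the unit writes its pid to /var/run/haproxy.pid:

check process haproxy with pidfile /var/run/haproxy.pid
    start program = "/usr/bin/systemctl start haproxy"
    stop program = "/usr/bin/systemctl stop haproxy"

Then enable and start monit:

systemctl enable monit
systemctl start monit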

