Out Of Memory Killer

Linux has a mechanism called the OOM Killer that kills processes when available memory gets too low.

While this is an interesting idea, it can kill your application before it kills something of lower priority.

Of course, the best solution is to add more memory to a server. But if it’s not immediately possible, you can make some changes to memory management and the OOM Killer to make sure your application has the highest priority for memory usage.

The /proc/[process id]/oom_adj file holds the OOM rating of a process. By default, every process has a 0 (zero) rating. As time goes by, some processes gain priority and the value in this file drops. A quick look at one of my servers shows a lot of zeros, a couple of -4s, a bunch of -15s, and a few -17s. Any process with a zero rating will be killed before the -4s and -15s, and the -17s will be the last ones touched (in practice, -17 disables OOM killing for that process).
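For a quick look at how these ratings are distributed on a box, something like this one-liner (a rough sketch; it just counts the values under /proc) does the trick:

cat /proc/[0-9]*/oom_adj 2>/dev/null | sort -n | uniq -c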

In order to ensure the application has the highest priority, make a list of the processes (not process IDs) that are a lower priority, such as monitoring or backup agents, and whip up a script that runs regularly. The script retrieves the PID of each listed process and sets its oom_adj value to 100, ensuring those lower priority processes are killed before a higher priority, more important application is touched. I use 100, though anything greater than zero works (note that many kernels only accept oom_adj values up to +15, and newer kernels expose the finer-grained oom_score_adj instead, so you may need to drop the value to 15).

#!/bin/bash

# oompriority is assumed to emit the list of lower priority process names.
for i in $(oompriority)
do
  # -v passes the process name into awk; ${i} would not expand inside single quotes.
  for OOMPID in $(ps -e | awk -v proc="${i}" '$0 ~ proc {print $1}')
  do
    if [[ ${OOMPID} -gt 0 ]]
    then
      # Note: many kernels only accept -17 through +15 here; drop to 15 if this write fails.
      echo 100 > /proc/${OOMPID}/oom_adj
    fi
  done
done



Kubernetes Delete an etcd Member

On the Kubernetes cluster, one of the etcd members had a falling out and was reporting stale data. While troubleshooting, we came up with several ideas, including just rebuilding the cluster. That's not all that hard overall, but it still causes some angst because everyone gets new tokens and applications have to be redeployed.

The process itself is simple enough.

etcdctl member list
etcdctl member remove [member hex code]

Since it's a TLS-enabled etcd with certificates, you have to pass the certificate information on the command line. In addition, you may have to go into the etcd pod to use its etcdctl command if you don't have a current etcdctl binary installed.

The command is the same though, whether you’re in the pod itself (easy to do from a central console) or running it on one of the masters where the etcd certs are also installed.
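If you do run it from inside the pod, a sketch looks like this; the pod name follows the kubeadm etcd-<hostname> pattern and is an assumption, so adjust it (and the cert paths) to match your cluster:

kubectl -n kube-system exec -it etcd-bldr0cuomkube1.internal.pri -- etcdctl member list --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key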

etcdctl member list --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key
59721c313837f64a, started, bldr0cuomkube3.internal.pri, https://192.168.101.71:2380, https://192.168.101.71:2379, false
cd0ea44e64569de6, started, bldr0cuomkube2.internal.pri, https://192.168.101.73:2380, https://192.168.101.73:2379, false
e588b22b4be790ad, started, bldr0cuomkube1.internal.pri, https://192.168.101.72:2380, https://192.168.101.72:2379, false

Then run the remove command, passing the same certificate flags.

etcdctl member remove e588b22b4be790ad --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key

And the etcd member has been removed.


Turkey Pot Pie

This was a really good recipe I found online when we had to do something with 22 lbs of turkey. We've been getting pot pies from the store lately, so it was already on my mind and I figured, let's see what I can do.

Ingredients

  • 2 cups frozen peas and carrots or other mixed vegetables.
  • 2 cups frozen green beans
  • 1 cup chopped celery
  • 2/3 cup of butter (1 1/3 sticks)
  • 2/3 cup chopped onion
  • 2/3 cup all-purpose flour
  • 1 teaspoon salt
  • 1 teaspoon pepper
  • 1/2 teaspoon celery seed (I’ve had these around for quite some time)
  • 1/2 teaspoon onion powder
  • 1/2 teaspoon Italian seasoning
  • 1 3/4 cups of chicken broth
  • 1 1/3 cups milk
  • 4 cups cubed and cooked turkey meat. Half light, half dark.
  • 4 nine-inch unbaked pie crusts. I used the Pillsbury ones.

Step 1 – Preheat the oven to 425F.

Step 2 – Cook the frozen vegetables and the chopped celery in a medium/large saucepan. Bring to a boil, then simmer until the celery is tender, about 8 to 10 minutes. Drain the vegetables and set aside.

Step 3 – Melt the butter in the saucepan over medium heat, add the onion, and cook until translucent, about 5 minutes. Stir in the flour, salt, black pepper, celery seed, onion powder, and Italian seasoning. Then slowly whisk in the chicken broth and milk until the mixture thickens. It was pretty liquid for most of the time, then I turned around for about 30 seconds and whomp, it had thickened. At that point, remove from the heat and stir in the cooked vegetables and turkey meat until well combined.

Step 4 – Fit two of the pie crusts into the bottoms of the pie dishes. Spoon half the vegetable and turkey filling into each dish and cover with a second pie crust. As with a regular pie, pinch the edges together all the way around. You might cut a couple of slits into the top; I didn't and it seemed to be fine.

Step 5 – Bake in the oven until the crusts are nicely browned, between 30 and 35 minutes. Cover with aluminum foil if the tops are browning too fast. Once done, cool for about 10 minutes and serve.


State of the Game Room 2020

My yearly list of what has arrived in the game room and maybe even what we’ve played over the past year.

History

Statistics

  • 153 New Entries in the Inventory database since December 30th 2019.
  • 22 Arkham Horror: The Card Game additions.
  • 30 Shadowrun additions.
  • 11 Dungeons & Dragons additions.
  • 17 other RPG purchases.
  • 27 new and used Board Games.
  • 13 new and used Card Games.

I say New and Used in a few places because of two events over the past year. The first was a friend and fellow gamer moving away from Colorado and back east to Virginia. Wen is a consummate gamer and an all-around great guy; we miss him in Colorado. As part of his departure, he was selling off some gaming gear. As someone always on the lookout for games, I headed down and looked over his collection. While I did get a pretty good stack (filled up the trunk on the motorcycle), the ones of note are Formula E, which is elephant racing! 😀 and an old game I used to have as a kid, Stratego.

Shadowrun

The second event was helping a fellow gamer who was having some personal troubles. He was selling some of his older, and hopefully not in use, gear and making it available to the Shadowrun Facebook group. Bull is a pretty well known Ork, so many folks stepped right up. I picked up some of this and some of that: in particular a Nerps pack of cards, a Shadowrun poster, Leviathan, and a few other bits.

Speaking of Shadowrun, I have a few interesting items this year. As part of a display of gear for the Shadowrun Facebook group, I snapped a pic of the crazy number of Limited Edition books that came out for Shadowrun 5th Edition. It turns out I had missed two, which I was able to easily track down: one from Catalyst itself and one on Amazon. In addition, I snagged the Executive Edition for 6th.

The other picture was of all the miscellaneous Shadowrun kit like dice ($150 a pack!!!), pins, glasses, and posters. I even have a Chessex Vinyl Shadowrun cover.

A friend from work happened to get two copies of the Shadowrun Sprawl Ops board game and gifted one to me, knowing I play. We'd had several discussions about Shadowrun before he was laid off.

But the best was tracking down the last two cards from the Denver Box Set. There are 6 plastic passes used to travel between the sections of Denver, and each box only has 2, meaning you had to hunt for the correct boxes. I was able to get two more from eBay back in 2006 when I got back into gaming, and then stumbled upon someone selling a box with the last ones I needed, which included the Aztlan pass. This gave me a complete set of cards!

Due to the Covid Virus and mask requirement, when the Shadowrun Masks became available, I picked up 5 sets along with other bits like dice and an S Shadowrun pin.

Over the past year, Jamie from Atomic Goblin Games in Longmont, Colorado, dumped some of his extra stuff that was just sitting around into my lap. As such, I acquired several Netrunner bits as part of a tournament kit and a pack of Star Wars X-Wing tournament bits. He's also given me two sets of the Shadowrun miniatures and card decks, as he gets them from Catalyst for free. He also had some Dark Souls miniatures expansions that I was able to get at his cost since I was the only one who played it.

The last bit of gear came from a posting on the work communications system (WebEx). We'd played some Munchkin this past year and I mentioned it, and someone said there was a Munchkin RPG. Oh really? I was able to track down the available books and picked them up. Very fun reading.

reddit Questions

Several things this year have reduced the number of games we wanted to play. We were doing a lot of house hunting and finally purchased a house in August. There were a lot of different requirements and deadlines, but we got it done. Add in all the moving hassle, and less time was available for gaming. While the band was able to come up on weekends to practice, due to the drummer's job change we lost a lot of gaming time. Then Colorado went Red, so we couldn't even have guests over. Jeanne and I did get some gaming in. We both changed jobs towards the end of the year, which had us doing more work related stuff, getting up to speed on the different technology for example.

Blog and Photos

Link to my blog where I go into more detail and have more pictures.

How Long Have You Been Gaming?

Well, I’m 63 now and started playing various games as a kid. My grandfather played gin and gin rummy with me when I was over and the adults played pinochle although I was able to play from time to time as well. We played all the standard games. Monopoly, Battleship, and even Chess. We started gaming more when we started doing Family Home Evening (Mormon thing) and I was introduced to Outdoor Survival, an Avalon Hill game. From there into wargaming and beyond!

Gaming This Year

We played quite a few games this year, in part due to Covid. Formula De, Raccoon Tycoon, Resident Evil, Munchkin, The Witches, Ticket to Ride (Rails and Sails), Splendor, The Doom That Came To Atlantic City, Car Wars, Nuclear War, Shadowrun Sprawl Ops, and Savage Worlds: Deadlands.

Favorite Board/Card Games

Of the past year's plays: probably the Resident Evil card game. It surprised me by being a pretty good game.

More current games: Ticket to Ride: Rails and Sails, Castles of Burgundy, Discoveries of Lewis and Clark, Bunny Kingdom, Formula De and Splendor.

Older: Car Wars, Cosmic Encounters, Nuclear War, and Ace of Aces.

Incoming and Outgoing

Generally Jamie down at Atomic Goblin Games will pick some out for me to check out. Other than Shadowrun books and paraphernalia, the only thing coming in is via Kickstarter and it’s the Steve Jackson Car Wars game.

As to outgoing, I just don't do that. I was close to selling off a bunch of my collection back in the '90s when I got into video games, but I backed out and have since seen many people who regretted getting rid of this game or that. I have the room for the gear, so it stays. Maybe next year. 🙂

Game Room Pictures

As to the game room, I did pick up another couple of Ikea Kallex boxes: a 4×4 one and a 2×4 one, which I put on top of the 4×4 one, resulting in a 6×4 configuration. Currently I have 4 5×5 shelves with a 1×4 shelf on top of each, 2 4×4 shelves with a 4×2 on top of each, a 1×4 shelf, and 2 2×4 shelves with a 2×2 on top of each, for a total of 192 Kallex squares of games.

And Pictures! These are going from entrance left side clockwise around the room.

All of the pictures are linked here if you want to see bigger ones. Game On!


Kubernetes Ansible Upgrade to 1.19.6

Upgrading Kubernetes Clusters

This document provides a guide to upgrading the Kubernetes clusters in the quickest manner. Much of the upgrade process can be done using Ansible playbooks. A few processes need to be done centrally on the tool server, and the OS and control plane updates are partly manual due to the requirement to manually remove servers from the Kubernetes API pool.

In most cases, examples are not provided, as it is assumed that you are familiar with the processes and can perform the updates without being reminded of how to verify each step.

For any process that is performed with an Ansible Playbook, it is assumed you are on the lnmt1cuomtool11 server in the /usr/local/admin/playbooks/cschelin/kubernetes directory. All Ansible related steps expect to start from that directory. In addition, the application of pod configurations will be in the configurations subdirectory.

Perform Upgrades

Patch Servers

In the 00-osupgrade directory, you'll be running the master and worker scripts. I recommend opening two windows, one for master and one for worker, and running each script with master -t [tag] and worker -t [tag]. Each script verifies a node is Ready, drains the node from the pool if it's a worker, performs a yum upgrade and reboot, uncordons the node again if it's a worker, and verifies the node is Ready again. Should a node fail to become Ready in time, the script exits.
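For reference, the per-node logic the worker script follows looks roughly like this. This is a simplified sketch with assumed node selection and timing; the real script handles tags and error checking.

for NODE in $(kubectl get nodes -l '!node-role.kubernetes.io/master' -o name | cut -d/ -f2)
do
  kubectl drain ${NODE} --ignore-daemonsets --delete-local-data     # remove from the pool
  ssh ${NODE} 'yum upgrade -y && shutdown -r now'                   # patch and reboot
  until kubectl get node ${NODE} | grep -qw Ready; do sleep 10; done # wait for Ready
  kubectl uncordon ${NODE}                                          # return to the pool
done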

Update Versionlock And Components

In the 03-packages directory, run the update -t [tag] script. This installs yum-plugin-versionlock if it's missing, removes the old versionlocks, creates new versionlocks for kubernetes, kubernetes-cni, and docker, and then upgrades the components.

Upgrade Kubernetes

Using the kubeadm command, upgrade the first master server.

# kubeadm upgrade apply 1.19.6

Upgrade Control Planes

On the second and third master, run the kubeadm upgrade apply 1.19.6 command and the control plane will be upgraded.

Update kube-proxy

Check the kube-proxy daemonset and update the image tag if required.

$ kubectl edit daemonset kube-proxy -n kube-system
  • Change image switching k8s.gcr.io with bldr0cuomrepo1.internal.pri:5000

Save the changes

Update coredns

Check the coredns-deployment and update the image tag if required.

$ kubectl edit deployment coredns -n kube-system
  • Change image switching k8s.gcr.io with bldr0cuomrepo1.internal.pri:5000

Save the changes.
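If you prefer a non-interactive edit for these two, kubectl set image can swap the image directly. The container names (kube-proxy, coredns) follow the default kubeadm objects and the tags are this quarter's versions; verify both against your cluster before running.

$ kubectl -n kube-system set image daemonset/kube-proxy kube-proxy=bldr0cuomrepo1.internal.pri:5000/kube-proxy:v1.19.6
$ kubectl -n kube-system set image deployment/coredns coredns=bldr0cuomrepo1.internal.pri:5000/coredns:1.7.0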

Restart kubelet and docker

In the 04-kubelet directory, run the update -t [tag] script. This will restart kubelet and docker on all servers.

Calico Upgrade

In the configurations/calico directory, run the following command:

$ kubectl apply -f calico.yaml

calicoctl Upgrade

Pull the updated calicoctl binary and copy it to /usr/local/bin.

$ curl -O -L  https://github.com/projectcalico/calicoctl/releases/download/v3.17.1/calicoctl
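Assuming the download landed in the current directory, make it executable and move it into place (with appropriate privileges):

$ chmod +x calicoctl
$ mv calicoctl /usr/local/bin/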

Update File and Directory Permissions and Manifests

In the postinstall directory, run the update -t [tag] script. This will perform the following steps.

  • Add the cluster-name to the kube-controller-manager.yaml file
  • Update the imagePullPolicy and image lines in all manifests
  • Add the AlwaysPullImages and ResourceQuota admission controllers to the kube-apiserver.yaml file.
  • Update the permissions of all files and directories.

Filebeat Upgrade

In the configurations directory, change to the appropriate cluster context directory (bldr0-0, cabo0-0, tato0-1, or lnmt1-2) and run the following command.

$ kubectl apply -f filebeat-kubernetes.yaml


Kubernetes Manual Upgrade to 1.19.6

Upgrading Kubernetes Clusters

This documentation is intended to provide the manual process for upgrading the server Operating Systems, Kubernetes to 1.19.6, and any additional updates. This provides example output and should help in troubleshooting should the automated processes experience a problem.

All of the steps required to prepare for an installation should be completed prior to starting this process.

Server and Kubernetes Upgrades

Patch Servers

As part of quarterly upgrades, the Operating Systems for all servers need to be upgraded.

For the control plane, there isn’t a “pool” so just patch each server and reboot it. Do one server at a time and check the status of the cluster before moving to subsequent master servers in the control plane.

For the worker nodes, you’ll need to drain each of the workers before patching and rebooting. Run the following command to both confirm the current version of 1.18.8 and that all nodes are in a Ready state to be patched:

$ kubectl get nodes
NAME                           STATUS   ROLES    AGE   VERSION
bldr0cuomknode1.internal.pri   Ready    <none>   90d   v1.18.8
bldr0cuomknode2.internal.pri   Ready    <none>   90d   v1.18.8
bldr0cuomknode3.internal.pri   Ready    <none>   90d   v1.18.8
bldr0cuomkube1.internal.pri    Ready    master   90d   v1.18.8
bldr0cuomkube2.internal.pri    Ready    master   90d   v1.18.8
bldr0cuomkube3.internal.pri    Ready    master   90d   v1.18.8

To drain a server, patch, and then return the server to the pool, follow the steps below:

$ kubectl drain [nodename] --delete-local-data --ignore-daemonsets

Then patch the server and reboot:

# yum upgrade -y
# shutdown -r now

Finally bring the node back into the pool.

$ kubectl uncordon [nodename]

Update Versionlock Information

Currently the clusters have locked kubernetes to version 1.18.8, kubernetes-cni to version 0.8.6, and docker to 1.13.1-162. The locks on each server need to be removed and new locks put in place for the new versions of kubernetes, kubernetes-cni, and docker where appropriate.

Versionlock file location: /etc/yum/pluginconf.d/

Simply delete the existing locks:

/usr/bin/yum versionlock delete "kubelet.*"
/usr/bin/yum versionlock delete "kubectl.*"
/usr/bin/yum versionlock delete "kubeadm.*"
/usr/bin/yum versionlock delete "kubernetes-cni.*"
/usr/bin/yum versionlock delete "docker.*"
/usr/bin/yum versionlock delete "docker-common.*"
/usr/bin/yum versionlock delete "docker-client.*"
/usr/bin/yum versionlock delete "docker-rhel-push-plugin.*"

And then add in the new locks at the desired levels:

/usr/bin/yum versionlock add "kubelet-1.19.6-0.*"
/usr/bin/yum versionlock add "kubectl-1.19.6-0.*"
/usr/bin/yum versionlock add "kubeadm-1.19.6-0.*"
/usr/bin/yum versionlock "docker-1.13.1-203.*"
/usr/bin/yum versionlock "docker-common-1.13.1-203.*"
/usr/bin/yum versionlock "docker-client-1.13.1-203.*"
/usr/bin/yum versionlock "docker-rhel-push-plugin-1.13.1-203.*"
/usr/bin/yum versionlock "kubernetes-cni-0.8.7-0.*"

Then install the updated kubernetes and docker binaries. Note that the versionlocked versions and the installed version must match:

/usr/bin/yum install kubelet-1.19.6-0.x86_64
/usr/bin/yum install kubectl-1.19.6-0.x86_64
/usr/bin/yum install kubeadm-1.19.6-0.x86_64
/usr/bin/yum install docker-1.13.1-203.git0be3e21.el7_8.x86_64
/usr/bin/yum install docker-common-1.13.1-203.git0be3e21.el7*
/usr/bin/yum install docker-client-1.13.1-203.git0be3e21.el7*
/usr/bin/yum install docker-rhel-push-plugin-1.13.1-203.git0be3e21.el7*
/usr/bin/yum install kubernetes-cni-0.8.7-0.x86_64

Upgrade Kubernetes

Using the kubeadm command on the first master server, you can review the plan and then upgrade the cluster:

# kubeadm upgrade plan
[upgrade/config] Making sure the configuration is correct:
[upgrade/config] Reading configuration from the cluster...
[upgrade/config] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[preflight] Running pre-flight checks.
[upgrade] Running cluster health checks
[upgrade] Fetching available versions to upgrade to
[upgrade/versions] Cluster version: v1.18.8
[upgrade/versions] kubeadm version: v1.19.6
I1224 02:04:43.067987 8753 version.go:252] remote version is much newer: v1.20.1; falling back to: stable-1.19
[upgrade/versions] Latest stable version: v1.19.6
[upgrade/versions] Latest stable version: v1.19.6
[upgrade/versions] Latest version in the v1.18 series: v1.18.14
[upgrade/versions] Latest version in the v1.18 series: v1.18.14

Components that must be upgraded manually after you have upgraded the control plane with 'kubeadm upgrade apply':
COMPONENT   CURRENT       AVAILABLE
kubelet     6 x v1.18.8   v1.18.14

Upgrade to the latest version in the v1.18 series:

COMPONENT                 CURRENT   AVAILABLE
kube-apiserver            v1.18.8   v1.18.14
kube-controller-manager   v1.18.8   v1.18.14
kube-scheduler            v1.18.8   v1.18.14
kube-proxy                v1.18.8   v1.18.14
CoreDNS                   1.6.7     1.7.0
etcd                      3.4.3-0   3.4.3-0

You can now apply the upgrade by executing the following command:

kubeadm upgrade apply v1.18.14

_____________________________________________________________________

Components that must be upgraded manually after you have upgraded the control plane with 'kubeadm upgrade apply':
COMPONENT   CURRENT       AVAILABLE
kubelet     6 x v1.18.8   v1.19.6

Upgrade to the latest stable version:

COMPONENT                 CURRENT   AVAILABLE
kube-apiserver            v1.18.8   v1.19.6
kube-controller-manager   v1.18.8   v1.19.6
kube-scheduler            v1.18.8   v1.19.6
kube-proxy                v1.18.8   v1.19.6
CoreDNS                   1.6.7     1.7.0
etcd                      3.4.3-0   3.4.13-0

You can now apply the upgrade by executing the following command:

kubeadm upgrade apply v1.19.6

_____________________________________________________________________


The table below shows the current state of component configs as understood by this version of kubeadm.
Configs that have a "yes" mark in the "MANUAL UPGRADE REQUIRED" column require manual config upgrade or
resetting to kubeadm defaults before a successful upgrade can be performed. The version to manually
upgrade to is denoted in the "PREFERRED VERSION" column.

API GROUP                 CURRENT VERSION   PREFERRED VERSION   MANUAL UPGRADE REQUIRED
kubeproxy.config.k8s.io   v1alpha1          v1alpha1            no
kubelet.config.k8s.io     v1beta1           v1beta1             no
_____________________________________________________________________

There are likely newer versions of Kubernetes control plane containers available. In order to maintain consistency across all clusters, only upgrade the masters to 1.19.6:

# kubeadm upgrade apply 1.19.6
[upgrade/config] Making sure the configuration is correct:
[upgrade/config] Reading configuration from the cluster...
[upgrade/config] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[preflight] Running pre-flight checks.
[upgrade] Running cluster health checks
[upgrade/version] You have chosen to change the cluster version to "v1.19.6"
[upgrade/versions] Cluster version: v1.18.8
[upgrade/versions] kubeadm version: v1.19.6
[upgrade/confirm] Are you sure you want to proceed with the upgrade? [y/N]: y
[upgrade/prepull] Pulling images required for setting up a Kubernetes cluster
[upgrade/prepull] This might take a minute or two, depending on the speed of your internet connection
[upgrade/prepull] You can also perform this action in beforehand using 'kubeadm config images pull'
[upgrade/apply] Upgrading your Static Pod-hosted control plane to version "v1.19.6"...
Static pod: kube-apiserver-bldr0cuomkube1.internal.pri hash: 053014e49eb31dd44a1951df85c466b0
Static pod: kube-controller-manager-bldr0cuomkube1.internal.pri hash: f23e1c90dbf9b2b0893cd8df7ee5d987
Static pod: kube-scheduler-bldr0cuomkube1.internal.pri hash: a3899df34b823393426e8f7ae39d8dee
[upgrade/etcd] Upgrading to TLS for etcd
Static pod: etcd-bldr0cuomkube1.internal.pri hash: 8d44a23a44041edc0180dec7c820610d
[upgrade/staticpods] Preparing for "etcd" upgrade
[upgrade/staticpods] Renewing etcd-server certificate
[upgrade/staticpods] Renewing etcd-peer certificate
[upgrade/staticpods] Renewing etcd-healthcheck-client certificate
[upgrade/staticpods] Moved new manifest to "/etc/kubernetes/manifests/etcd.yaml" and backed up old manifest to "/etc/kubernetes/tmp/kubeadm-backup-manifests-2020-12-24-21-50-13/etcd.yaml"
[upgrade/staticpods] Waiting for the kubelet to restart the component
[upgrade/staticpods] This might take a minute or longer depending on the component/version gap (timeout 5m0s)
Static pod: etcd-bldr0cuomkube1.internal.pri hash: 8d44a23a44041edc0180dec7c820610d
Static pod: etcd-bldr0cuomkube1.internal.pri hash: ab0e3948b56eb191236044c56350be62
[apiclient] Found 3 Pods for label selector component=etcd
[upgrade/staticpods] Component "etcd" upgraded successfully!
[upgrade/etcd] Waiting for etcd to become available
[upgrade/staticpods] Writing new Static Pod manifests to "/etc/kubernetes/tmp/kubeadm-upgraded-manifests840688942"
[upgrade/staticpods] Preparing for "kube-apiserver" upgrade
[upgrade/staticpods] Renewing apiserver certificate
[upgrade/staticpods] Renewing apiserver-kubelet-client certificate
[upgrade/staticpods] Renewing front-proxy-client certificate
[upgrade/staticpods] Renewing apiserver-etcd-client certificate
[upgrade/staticpods] Moved new manifest to "/etc/kubernetes/manifests/kube-apiserver.yaml" and backed up old manifest to "/etc/kubernetes/tmp/kubeadm-backup-manifests-2020-12-24-21-50-13/kube-apiserver.yaml"
[upgrade/staticpods] Waiting for the kubelet to restart the component
[upgrade/staticpods] This might take a minute or longer depending on the component/version gap (timeout 5m0s)
Static pod: kube-apiserver-bldr0cuomkube1.internal.pri hash: 053014e49eb31dd44a1951df85c466b0
Static pod: kube-apiserver-bldr0cuomkube1.internal.pri hash: 4279fd8bec56cdea97ff8f8f7f5547d3
[apiclient] Found 3 Pods for label selector component=kube-apiserver
[upgrade/staticpods] Component "kube-apiserver" upgraded successfully!
[upgrade/staticpods] Preparing for "kube-controller-manager" upgrade
[upgrade/staticpods] Renewing controller-manager.conf certificate
[upgrade/staticpods] Moved new manifest to "/etc/kubernetes/manifests/kube-controller-manager.yaml" and backed up old manifest to "/etc/kubernetes/tmp/kubeadm-backup-manifests-2020-12-24-21-50-13/kube-controller-manager.yaml"
[upgrade/staticpods] Waiting for the kubelet to restart the component
[upgrade/staticpods] This might take a minute or longer depending on the component/version gap (timeout 5m0s)
Static pod: kube-controller-manager-bldr0cuomkube1.internal.pri hash: f23e1c90dbf9b2b0893cd8df7ee5d987
Static pod: kube-controller-manager-bldr0cuomkube1.internal.pri hash: 202ee2ffdb77add9d9f3327e4fd827fc
[apiclient] Found 3 Pods for label selector component=kube-controller-manager
[upgrade/staticpods] Component "kube-controller-manager" upgraded successfully!
[upgrade/staticpods] Preparing for "kube-scheduler" upgrade
[upgrade/staticpods] Renewing scheduler.conf certificate
[upgrade/staticpods] Moved new manifest to "/etc/kubernetes/manifests/kube-scheduler.yaml" and backed up old manifest to "/etc/kubernetes/tmp/kubeadm-backup-manifests-2020-12-24-21-50-13/kube-scheduler.yaml"
[upgrade/staticpods] Waiting for the kubelet to restart the component
[upgrade/staticpods] This might take a minute or longer depending on the component/version gap (timeout 5m0s)
Static pod: kube-scheduler-bldr0cuomkube1.internal.pri hash: a3899df34b823393426e8f7ae39d8dee
Static pod: kube-scheduler-bldr0cuomkube1.internal.pri hash: 5a568caf05a8bd40ae4b30cf4dcd90eb
[apiclient] Found 3 Pods for label selector component=kube-scheduler
[upgrade/staticpods] Component "kube-scheduler" upgraded successfully!
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.19" in namespace kube-system with the configuration for the kubelets in the cluster
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

[upgrade/successful] SUCCESS! Your cluster was upgraded to "v1.19.6". Enjoy!

[upgrade/kubelet] Now that your control plane is upgraded, please proceed with upgrading your kubelets if you haven't already done so.

Update Control Planes

On the second and third master, run the kubeadm upgrade apply 1.19.6 command and the control plane will be upgraded.

Update File and Directory Permissions

Verify the permissions match the table below once the upgrade is complete:

Path or File                                             user:group   Permissions
/etc/kubernetes/manifests/etcd.yaml                      root:root    0644
/etc/kubernetes/manifests/kube-apiserver.yaml            root:root    0644
/etc/kubernetes/manifests/kube-controller-manager.yaml   root:root    0644
/etc/kubernetes/manifests/kube-scheduler.yaml            root:root    0644
/var/lib/etcd                                            root:root    0700
/etc/kubernetes/admin.conf                               root:root    0644
/etc/kubernetes/scheduler.conf                           root:root    0644
/etc/kubernetes/controller-manager.conf                  root:root    0644
/etc/kubernetes/pki                                      root:root    0755
/etc/kubernetes/pki/ca.crt                               root:root    0644
/etc/kubernetes/pki/apiserver.crt                        root:root    0644
/etc/kubernetes/pki/apiserver-kubelet-client.crt         root:root    0644
/etc/kubernetes/pki/front-proxy-ca.crt                   root:root    0644
/etc/kubernetes/pki/front-proxy-client.crt               root:root    0644
/etc/kubernetes/pki/sa.pub                               root:root    0644
/etc/kubernetes/pki/ca.key                               root:root    0600
/etc/kubernetes/pki/apiserver.key                        root:root    0600
/etc/kubernetes/pki/apiserver-kubelet-client.key         root:root    0600
/etc/kubernetes/pki/front-proxy-ca.key                   root:root    0600
/etc/kubernetes/pki/front-proxy-client.key               root:root    0600
/etc/kubernetes/pki/sa.key                               root:root    0600
/etc/kubernetes/pki/etcd                                 root:root    0755
/etc/kubernetes/pki/etcd/ca.crt                          root:root    0644
/etc/kubernetes/pki/etcd/server.crt                      root:root    0644
/etc/kubernetes/pki/etcd/peer.crt                        root:root    0644
/etc/kubernetes/pki/etcd/healthcheck-client.crt          root:root    0644
/etc/kubernetes/pki/etcd/ca.key                          root:root    0600
/etc/kubernetes/pki/etcd/server.key                      root:root    0600
/etc/kubernetes/pki/etcd/peer.key                        root:root    0600
/etc/kubernetes/pki/etcd/healthcheck-client.key          root:root    0600
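A quick way to spot-check these is with stat (a sketch; it only lists modes and ownership, so compare the output against the table above and adjust paths if your layout differs):

# stat -c '%a %U:%G %n' /etc/kubernetes/manifests/*.yaml /etc/kubernetes/*.conf /var/lib/etcd /etc/kubernetes/pki /etc/kubernetes/pki/* /etc/kubernetes/pki/etcd/*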

Update Manifests

During the kubeadm upgrade, the current control plane manifests are moved from /etc/kubernetes/manifests into /etc/kubernetes/tmp and new manifest files deployed. There are multiple settings and permissions that need to be reviewed and updated before the task is considered completed.

The kubeadm-config configmap has been updated to point to bldr0cuomrepo1.internal.pri:5000, however it and the various container configurations should be checked anyway. If it isn't updated or used, you'll have to make the update manually, including manually editing the kube-proxy daemonset configuration.
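A quick way to confirm where the configmap points (a sketch; the expected value is the local repository noted above):

$ kubectl -n kube-system get cm kubeadm-config -o yaml | grep imageRepository
    imageRepository: bldr0cuomrepo1.internal.pri:5000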

Note that when a manifest is updated, the associated image is reloaded. No need to manage the pods once manifests are updated.

etcd Manifest

Verify and update etcd.yaml

  • Change imagePullPolicy to Always.
  • Change image switching k8s.gcr.io with bldr0cuomrepo1.internal.pri:5000

kube-apiserver Manifest

Verify and update kube-apiserver.yaml

  • Add AlwaysPullImages and ResourceQuota admission controllers to the –enable-admission-plugins line
  • Change imagePullPolicy to Always
  • Change image switching k8s.gcr.io with bldr0cuomrepo1.internal.pri:5000

kube-controller-manager Manifest

Verify and update kube-controller-manager.yaml

  • Add "- --cluster-name=kubecluster-[site]" after "- --cluster-cidr=192.168.0.0/16"
  • Change imagePullPolicy to Always
  • Change image switching k8s.gcr.io to bldr0cuomrepo1.internal.pri:5000

kube-scheduler Manifest

Verify and update kube-scheduler.yaml

  • Change imagePullPolicy to Always
  • Change image switching k8s.gcr.io to bldr0cuomrepo1.internal.pri:5000
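The image swaps in the four manifests above can also be scripted. This is a hedged sketch using sed against the default kubeadm manifest paths; review each file afterward, since the imagePullPolicy, admission plugin, and cluster-name edits still need to be made by hand.

cd /etc/kubernetes/manifests
sed -i 's|image: k8s.gcr.io/|image: bldr0cuomrepo1.internal.pri:5000/|' etcd.yaml kube-apiserver.yaml kube-controller-manager.yaml kube-scheduler.yaml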

Update kube-proxy

You’ll need to edit the kube-proxy daemonset to change the imagePullPolicy. Check the image tag at the same time.

$ kubectl edit daemonset kube-proxy -n kube-system
  • Change imagePullPolicy to Always.
  • Change image switching k8s.gcr.io with bldr0cuomrepo1.internal.pri:5000

Save the changes.

Update coredns

You'll need to edit the coredns deployment to change the imagePullPolicy. Check the image tag at the same time.

$ kubectl edit deployment coredns -n kube-system
  • Change imagePullPolicy to Always
  • Change image switching k8s.gcr.io to bldr0cuomrepo1.internal.pri:5000

Save the changes

Restart kubelet

Once done, kubelet and docker need to be restarted on all nodes.

systemctl daemon-reload
systemctl restart kubelet
systemctl restart docker

Verify

Once kubelet has been restarted on all nodes, verify all nodes are at 1.19.6.

$ kubectl get nodes
NAME                           STATUS   ROLES    AGE   VERSION
bldr0cuomknode1.internal.pri   Ready    <none>   91d   v1.19.6
bldr0cuomknode2.internal.pri   Ready    <none>   91d   v1.19.6
bldr0cuomknode3.internal.pri   Ready    <none>   91d   v1.19.6
bldr0cuomkube1.internal.pri    Ready    master   91d   v1.19.6
bldr0cuomkube2.internal.pri    Ready    master   91d   v1.19.6
bldr0cuomkube3.internal.pri    Ready    master   91d   v1.19.6

Configuration Upgrades

Configuration files are on the tool servers (lnmt1cuomtool11) in the /usr/local/admin/playbooks/cschelin/kubernetes/configurations directory and the expectation is you’ll be in that directory when directed to apply configurations.

Calico Upgrade

In the calico directory, run the following command:

$ kubectl apply -f calico.yaml
configmap/calico-config unchanged
customresourcedefinition.apiextensions.k8s.io/bgpconfigurations.crd.projectcalico.org configured
customresourcedefinition.apiextensions.k8s.io/bgppeers.crd.projectcalico.org configured
customresourcedefinition.apiextensions.k8s.io/blockaffinities.crd.projectcalico.org configured
customresourcedefinition.apiextensions.k8s.io/clusterinformations.crd.projectcalico.org configured
customresourcedefinition.apiextensions.k8s.io/felixconfigurations.crd.projectcalico.org configured
customresourcedefinition.apiextensions.k8s.io/globalnetworkpolicies.crd.projectcalico.org configured
customresourcedefinition.apiextensions.k8s.io/globalnetworksets.crd.projectcalico.org configured
customresourcedefinition.apiextensions.k8s.io/hostendpoints.crd.projectcalico.org configured
customresourcedefinition.apiextensions.k8s.io/ipamblocks.crd.projectcalico.org configured
customresourcedefinition.apiextensions.k8s.io/ipamconfigs.crd.projectcalico.org configured
customresourcedefinition.apiextensions.k8s.io/ipamhandles.crd.projectcalico.org configured
customresourcedefinition.apiextensions.k8s.io/ippools.crd.projectcalico.org configured
customresourcedefinition.apiextensions.k8s.io/kubecontrollersconfigurations.crd.projectcalico.org configured
customresourcedefinition.apiextensions.k8s.io/networkpolicies.crd.projectcalico.org configured
customresourcedefinition.apiextensions.k8s.io/networksets.crd.projectcalico.org configured
clusterrole.rbac.authorization.k8s.io/calico-kube-controllers unchanged
clusterrolebinding.rbac.authorization.k8s.io/calico-kube-controllers unchanged
clusterrole.rbac.authorization.k8s.io/calico-node unchanged
clusterrolebinding.rbac.authorization.k8s.io/calico-node unchanged
daemonset.apps/calico-node configured
serviceaccount/calico-node unchanged
deployment.apps/calico-kube-controllers configured
serviceaccount/calico-kube-controllers unchanged

After calico is applied, the calico-kube-controllers pod restarts and then the calico-node pods restart to retrieve the updated image.

Pull the calicoctl binary and copy it to /usr/local/bin, then verify the version. Note that this has likely already been done on the tool server. Verify it before pulling the binary.

$ curl -O -L  https://github.com/projectcalico/calicoctl/releases/download/v3.17.1/calicoctl
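Assuming the download landed in the current directory, make it executable and move it into place (with appropriate privileges) before checking the version:

$ chmod +x calicoctl
$ mv calicoctl /usr/local/bin/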

Verification

$ calicoctl version
Client Version:    v3.17.1
Git commit:        8871aca3
Cluster Version:   v3.17.1
Cluster Type:      k8s,bgp,kubeadm,kdd

Update CNI File Permissions

Verify the permissions of the files once the upgrade is complete.

Path or File                        user:group   Permissions
/etc/cni/net.d/10-calico.conflist   root:root    0644
/etc/cni/net.d/calico-kubeconfig    root:root    0644

metrics-server Upgrade

In the metrics-server directory, run the following command:

$ kubectl apply -f components.yaml
serviceaccount/metrics-server created
clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader created
clusterrole.rbac.authorization.k8s.io/system:metrics-server created
rolebinding.rbac.authorization.k8s.io/metrics-server-auth-reader created
clusterrolebinding.rbac.authorization.k8s.io/metrics-server:system:auth-delegator created
clusterrolebinding.rbac.authorization.k8s.io/system:metrics-server created
service/metrics-server created
deployment.apps/metrics-server created
apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io created

Once the metrics-server deployment has been updated, the pod will restart.

kube-state-metrics Upgrade

As noted, this pod doesn’t need to be upgraded.

Filebeat Upgrade

Filebeat ships logs to Elastic Stack clusters in four environments. Filebeat itself is installed on all Kubernetes clusters. Ensure you're managing the correct cluster when upgrading the filebeat container, as the configurations are specific to each cluster.

Change to the appropriate cluster context directory and run the following command:

$ kubectl apply -f filebeat-kubernetes.yaml
configmap/filebeat-config created
daemonset.apps/filebeat created
clusterrolebinding.rbac.authorization.k8s.io/filebeat created
clusterrole.rbac.authorization.k8s.io/filebeat created
serviceaccount/filebeat created

Verification

Essentially monitor each cluster. You should see the filebeat containers restarting and returning to a Running state.

$ kubectl get pods -n monitoring -o wide


Kubernetes Preparation Steps for 1.19.6

Upgrading Kubernetes Clusters

The purpose of this document is to provide the background information on what is being upgraded, what versions, and the steps required to prepare for the upgrade itself. These steps are only done once. Once all these steps have been completed and all the configurations checked into github and gitlab, all clusters are then ready to be upgraded.

Reference links to product documentation at the end of this document.

Upgrade Preparation Steps

Upgrades to the Sandbox environment are done a few weeks before the official release for more in-depth testing. This includes checking the release docs, changelog, and general operational status of the various tools in use.

Server Preparations

With the possibility of an upgrade to Spacewalk and to ensure the necessary software is installed prior to the upgrade, make sure all repositories are enabled and that the yum-plugin-versionlock software is installed.

Enable Repositories

Check the Spacewalk configuration and ensure that upgrades are coming from the local server and not from the internet.

Install yum versionlock

The critical components of Kubernetes are locked into place using the versionlock yum plugin. If not already installed, install it before beginning work.

# yum install yum-plugin-versionlock -y

Load Images

The next step is to load all the necessary Kubernetes, etcd, and additional images like coredns into the local repository so that the clusters aren't pulling images from the internet. As a note, pause:3.1 has been upgraded to pause:3.2; make sure you pull and push the new image.

# docker pull k8s.gcr.io/etcd:3.4.13-0
3.4.13-0: Pulling from etcd
4000adbbc3eb: Pull complete
d72167780652: Pull complete
d60490a768b5: Pull complete
4a4b5535d134: Pull complete
0dac37e8b31a: Pull complete
Digest: sha256:4ad90a11b55313b182afc186b9876c8e891531b8db4c9bf1541953021618d0e2
Status: Downloaded newer image for k8s.gcr.io/etcd:3.4.13-0
k8s.gcr.io/etcd:3.4.13-0

# docker pull k8s.gcr.io/kube-apiserver:v1.19.6
v1.19.6: Pulling from kube-apiserver
f398b465657e: Pull complete
cbcdf8ef32b4: Pull complete
1ba2da83d184: Pull complete
Digest: sha256:5cf4a3622acbde74406a6b292d88e6d033070fc0f6e4cd50c13c182ba7c7a1ca
Status: Downloaded newer image for k8s.gcr.io/kube-apiserver:v1.19.6
k8s.gcr.io/kube-apiserver:v1.19.6

# docker pull k8s.gcr.io/kube-controller-manager:v1.19.6
v1.19.6: Pulling from kube-controller-manager
f398b465657e: Already exists
cbcdf8ef32b4: Already exists
22e45a96b75b: Pull complete
Digest: sha256:96c29073b29003f58faec22912aed45de831b4393eb4c8722fe1c3f5e4c296be
Status: Downloaded newer image for k8s.gcr.io/kube-controller-manager:v1.19.6
k8s.gcr.io/kube-controller-manager:v1.19.6

# docker pull k8s.gcr.io/kube-scheduler:v1.19.6
v1.19.6: Pulling from kube-scheduler
f398b465657e: Already exists
cbcdf8ef32b4: Already exists
8eda9f73d5d9: Pull complete
Digest: sha256:d96fdb88d032df719f6fb832aaafd3b90c688c216b7f8d3d01cd7f48664b6f37
Status: Downloaded newer image for k8s.gcr.io/kube-scheduler:v1.19.6
k8s.gcr.io/kube-scheduler:v1.19.6

# docker pull k8s.gcr.io/kube-proxy:v1.19.6
v1.19.6: Pulling from kube-proxy
4ba180b702c8: Already exists
85b604bcc41a: Pull complete
fafe7e2b354a: Pull complete
b2c4667c1ca7: Pull complete
c93c6a0c3ea5: Pull complete
beea6d17d8e9: Pull complete
9401490890f6: Pull complete
Digest: sha256:b0cb8f17f251f311da0d5681c8aa08cba83d85e6c520bf4d842e3c457f46ce92
Status: Downloaded newer image for k8s.gcr.io/kube-proxy:v1.19.6
k8s.gcr.io/kube-proxy:v1.19.6

# docker pull k8s.gcr.io/coredns:1.7.0
1.7.0: Pulling from coredns
c6568d217a00: Pull complete
6937ebe10f02: Pull complete
Digest: sha256:73ca82b4ce829766d4f1f10947c3a338888f876fbed0540dc849c89ff256e90c
Status: Downloaded newer image for k8s.gcr.io/coredns:1.7.0
k8s.gcr.io/coredns:1.7.0

# docker pull k8s.gcr.io/pause:3.2
3.2: Pulling from pause
c74f8866df09: Pull complete
Digest: sha256:927d98197ec1141a368550822d18fa1c60bdae27b78b0c004f705f548c07814f
Status: Downloaded newer image for k8s.gcr.io/pause:3.2
k8s.gcr.io/pause:3.2

# docker image ls
REPOSITORY                              TAG                 IMAGE ID            CREATED             SIZE
k8s.gcr.io/kube-proxy                   v1.19.6             dbcc366449b0        6 days ago          118MB
k8s.gcr.io/kube-apiserver               v1.19.6             5522f5e5fd7d        6 days ago          119MB
k8s.gcr.io/kube-controller-manager      v1.19.6             9dc349037b41        6 days ago          111MB
k8s.gcr.io/kube-scheduler               v1.19.6             bf39b6341770        6 days ago          45.6MB
k8s.gcr.io/etcd                         3.4.13-0            0369cf4303ff        3 months ago        253MB
k8s.gcr.io/coredns                      1.7.0               bfe3a36ebd25        6 months ago        45.2MB
k8s.gcr.io/pause                        3.2                 80d28bedfe5d        10 months ago       683kB

Next up is to tag all the images so they’ll be hosted locally on the bldr0cuomrepo1.internal.pri server.

# docker tag k8s.gcr.io/etcd:3.4.13-0 bldr0cuomrepo1.internal.pri:5000/etcd:3.4.13-0
# docker tag k8s.gcr.io/kube-apiserver:v1.19.6 bldr0cuomrepo1.internal.pri:5000/kube-apiserver:v1.19.6
# docker tag k8s.gcr.io/kube-controller-manager:v1.19.6 bldr0cuomrepo1.internal.pri:5000/kube-controller-manager:v1.19.6
# docker tag k8s.gcr.io/kube-scheduler:v1.19.6 bldr0cuomrepo1.internal.pri:5000/kube-scheduler:v1.19.6
# docker tag k8s.gcr.io/kube-proxy:v1.19.6 bldr0cuomrepo1.internal.pri:5000/kube-proxy:v1.19.6
# docker tag k8s.gcr.io/coredns:1.7.0 bldr0cuomrepo1.internal.pri:5000/coredns:1.7.0
# docker tag k8s.gcr.io/pause:3.2 bldr0cuomrepo1.internal.pri:5000/pause:3.2

# docker image ls
REPOSITORY                                                 TAG                 IMAGE ID            CREATED             SIZE
bldr0cuomrepo1.internal.pri:5000/kube-proxy                v1.19.6             dbcc366449b0        6 days ago          118MB
k8s.gcr.io/kube-proxy                                      v1.19.6             dbcc366449b0        6 days ago          118MB
bldr0cuomrepo1.internal.pri:5000/kube-controller-manager   v1.19.6             9dc349037b41        6 days ago          111MB
k8s.gcr.io/kube-controller-manager                         v1.19.6             9dc349037b41        6 days ago          111MB
bldr0cuomrepo1.internal.pri:5000/kube-scheduler            v1.19.6             bf39b6341770        6 days ago          45.6MB
k8s.gcr.io/kube-scheduler                                  v1.19.6             bf39b6341770        6 days ago          45.6MB
bldr0cuomrepo1.internal.pri:5000/kube-apiserver            v1.19.6             5522f5e5fd7d        6 days ago          119MB
k8s.gcr.io/kube-apiserver                                  v1.19.6             5522f5e5fd7d        6 days ago          119MB
bldr0cuomrepo1.internal.pri:5000/etcd                      3.4.13-0            0369cf4303ff        3 months ago        253MB
k8s.gcr.io/etcd                                            3.4.13-0            0369cf4303ff        3 months ago        253MB
bldr0cuomrepo1.internal.pri:5000/coredns                   1.7.0               bfe3a36ebd25        6 months ago        45.2MB
k8s.gcr.io/coredns                                         1.7.0               bfe3a36ebd25        6 months ago        45.2MB
bldr0cuomrepo1.internal.pri:5000/pause                     3.2                 80d28bedfe5d        10 months ago       683kB
k8s.gcr.io/pause                                           3.2                 80d28bedfe5d        10 months ago       683kB

The final step is to push them all up to the local repository.

# docker push bldr0cuomrepo1.internal.pri:5000/etcd:3.4.13-0
The push refers to repository [bldr0cuomrepo1.internal.pri:5000/etcd]
bb63b9467928: Pushed
bfa5849f3d09: Pushed
1a4e46412eb0: Pushed
d61c79b29299: Pushed
d72a74c56330: Pushed
3.4.13-0: digest: sha256:bd4d2c9a19be8a492bc79df53eee199fd04b415e9993eb69f7718052602a147a size: 1372

# docker push bldr0cuomrepo1.internal.pri:5000/kube-apiserver:v1.19.6
The push refers to repository [bldr0cuomrepo1.internal.pri:5000/kube-apiserver]
3721be488f60: Pushed
597f1090d8e9: Pushed
e7ee84ae4d13: Pushed
v1.19.6: digest: sha256:165196f6df4953429054bad29571c4aee1700c5d370f6a7c4415293371320ca0 size: 949

# docker push bldr0cuomrepo1.internal.pri:5000/kube-controller-manager:v1.19.6
The push refers to repository [bldr0cuomrepo1.internal.pri:5000/kube-controller-manager]
d3d1d4836f26: Pushed
597f1090d8e9: Mounted from kube-apiserver
e7ee84ae4d13: Mounted from kube-apiserver
v1.19.6: digest: sha256:c6631f1624152013ec188ca11ce42580fe34bb83aaef521cdffc89909316207e size: 949

# docker push bldr0cuomrepo1.internal.pri:5000/kube-scheduler:v1.19.6
The push refers to repository [bldr0cuomrepo1.internal.pri:5000/kube-scheduler]
b468c9e3b9f6: Pushed
597f1090d8e9: Mounted from kube-controller-manager
e7ee84ae4d13: Mounted from kube-controller-manager
v1.19.6: digest: sha256:dfd0c6ea6ea3ce2ec29dad98b3891495f5df8271ca21bca8857cfee2ad18b66f size: 949

# docker push bldr0cuomrepo1.internal.pri:5000/kube-proxy:v1.19.6
The push refers to repository [bldr0cuomrepo1.internal.pri:5000/kube-proxy]
d4aabbee649e: Pushed
78dd6c0504a7: Pushed
061bfb5cb861: Pushed
1b55846906e8: Pushed
b9b82a97c787: Pushed
b4e54f331697: Pushed
91e3a07063b3: Mounted from kube-scheduler
v1.19.6: digest: sha256:c4c840cba79da1a61172f77af81173faf19a2f5ee58f3ff8be3ba68a279b14a0 size: 1786

# docker push bldr0cuomrepo1.internal.pri:5000/coredns:1.7.0
The push refers to repository [bldr0cuomrepo1.internal.pri:5000/coredns]
96d17b0b58a7: Pushed
225df95e717c: Pushed
1.7.0: digest: sha256:242d440e3192ffbcecd40e9536891f4d9be46a650363f3a004497c2070f96f5a size: 739

# docker push bldr0cuomrepo1.internal.pri:5000/pause:3.2
The push refers to repository [bldr0cuomrepo1.internal.pri:5000/pause]
ba0dae6243cc: Pushed
3.2: digest: sha256:4a1c4b21597c1b4415bdbecb28a3296c6b5e23ca4f9feeb599860a1dac6a0108 size: 526

Software Preparations

This section describes the updates that need to be made to the various containers that are installed in the Kubernetes clusters. Most of the changes involve updating the location to point to my Docker Repository vs pulling directly from the Internet.

You'll need to clone the playbook repo from gitlab if it's new, or pull the current one, as all the work will be done in various directories under the kubernetes/configurations directory. You'll want to do that before continuing. All subsequent sections assume you're in the kubernetes/configurations directory.

$ git clone git@lnmt1cuomgitlab.internal.pri:external-unix/playbooks.git
$ git pull git@lnmt1cuomgitlab.internal.pri:external-unix/playbooks.git

Make sure you add and commit the changes to your repo.

$ git add [file]
$ git commit [file] -m "commit comment"

And once done with all the updates, push the changes back up to gitlab.

$ git push

Update calico.yaml

In the calico directory, run the following command to get the current calico.yaml file.

$ curl https://docs.projectcalico.org/manifests/calico.yaml -O

Grep out the image lines and pull the new images down so they can be hosted in the local repository.
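For example, a simple way to list the image references that need to be mirrored:

$ grep 'image:' calico.yaml | sort -u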

# docker pull docker.io/calico/cni:v3.17.1
v3.17.1: Pulling from calico/cni
3d42ab7fd2aa: Pull complete
a0a1a170563e: Pull complete
4d26d217f6ba: Pull complete
Digest: sha256:3dc2506632843491864ce73a6e73d5bba7d0dc25ec0df00c1baa91d17549b068
Status: Downloaded newer image for calico/cni:v3.17.1
docker.io/calico/cni:v3.17.1

# docker pull docker.io/calico/pod2daemon-flexvol:v3.17.1
v3.17.1: Pulling from calico/pod2daemon-flexvol
1099d2df204e: Pull complete
9aef96ab1093: Pull complete
583d1a1aef56: Pull complete
3eb06e0bf22b: Pull complete
8362899d2e86: Pull complete
9fb2be1a9d3e: Pull complete
c54d4908d08c: Pull complete
Digest: sha256:48f277d41c35dae051d7dd6f0ec8f64ac7ee6650e27102a41b0203a0c2ce6c6b
Status: Downloaded newer image for calico/pod2daemon-flexvol:v3.17.1
docker.io/calico/pod2daemon-flexvol:v3.17.1

# docker pull docker.io/calico/node:v3.17.1
v3.17.1: Pulling from calico/node
a019d9c0ce8b: Pull complete
fa31af8ad59c: Pull complete
Digest: sha256:25e0b0495c0df3a7a06b6f9e92203c53e5b56c143ac1c885885ee84bf86285ff
Status: Downloaded newer image for calico/node:v3.17.1
docker.io/calico/node:v3.17.1

# docker pull docker.io/calico/kube-controllers:v3.17.1
v3.17.1: Pulling from calico/kube-controllers
c36c2fad477a: Pull complete
38fb4366911a: Pull complete
d7deb0c84128: Pull complete
c710f3356d3b: Pull complete
Digest: sha256:d27dd1780b265406782578ae55b5ff885b94765a36b4df43cdaa4a8592eba2db
Status: Downloaded newer image for calico/kube-controllers:v3.17.1
docker.io/calico/kube-controllers:v3.17.1

Then tag the images for local storage.

# docker tag calico/cni:v3.17.1 bldr0cuomrepo1.internal.pri:5000/cni:v3.17.1
# docker tag calico/pod2daemon-flexvol:v3.17.1 bldr0cuomrepo1.internal.pri:5000/pod2daemon-flexvol:v3.17.1
# docker tag calico/node:v3.17.1 bldr0cuomrepo1.internal.pri:5000/node:v3.17.1
# docker tag calico/kube-controllers:v3.17.1 bldr0cuomrepo1.internal.pri:5000/kube-controllers:v3.17.1

Then push them up to the local repository.

# docker push bldr0cuomrepo1.internal.pri:5000/cni:v3.17.1
The push refers to repository [bldr0cuomrepo1.internal.pri:5000/cni]
23a79ec53bb3: Pushed
40663f1967b3: Pushed
a13bf69d4d98: Pushed
v3.17.1: digest: sha256:4f8cbbaf93ef9c549021423ac804ac3e15e366c8a61cf6008b4737d924fe65e2 size: 946

# docker push bldr0cuomrepo1.internal.pri:5000/pod2daemon-flexvol:v3.17.1
The push refers to repository [bldr0cuomrepo1.internal.pri:5000/pod2daemon-flexvol]
fee23ca43586: Pushed
db5b7a686992: Pushed
1e5330946944: Pushed
aeedaec3fa39: Pushed
d0dcbddd6708: Pushed
50295429f9b9: Pushed
4f676ac8854c: Pushed
v3.17.1: digest: sha256:0e63dd25602907c54e43f479d00ea83d7c4388f9a69b1457358ae043edbb56cd size: 1788

# docker push bldr0cuomrepo1.internal.pri:5000/node:v3.17.1
The push refers to repository [bldr0cuomrepo1.internal.pri:5000/node]
3633b710791b: Pushed
40cad2715d16: Pushed
v3.17.1: digest: sha256:304dd23bcda5216026f1601cb61395792249f1c58c98771198f6e517b0f5c96b size: 737

# docker push bldr0cuomrepo1.internal.pri:5000/kube-controllers:v3.17.1
The push refers to repository [bldr0cuomrepo1.internal.pri:5000/kube-controllers]
a945e41cb5e1: Pushed
cd7170f5d387: Pushed
b3cb3ad89824: Pushed
d7ecfe7ff366: Pushed
v3.17.1: digest: sha256:95b53efaad09a3d09f43c4f950a1675f932c25bd3781e4fa533c3a3f9a16958c size: 1155

Edit the file, search for image:, and insert the path to the local repository in front of calico.

bldr0cuomrepo1.internal.pri:5000/

Make sure you follow the documentation to update calicoctl to 3.17.1.

Update metrics-server

In the metrics-server directory, run the following command to get the current components.yaml file:

$ wget https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.4.1/components.yaml

Edit the file, search for image: and replace k8s.gcr.io with bldr0cuomrepo1.internal.pri:5000/

Download the new image and save it locally.

# docker pull k8s.gcr.io/metrics-server/metrics-server:v0.4.1
v0.4.1: Pulling from metrics-server/metrics-server
e59bd8947ac7: Pull complete
cdbcff7dade2: Pull complete
Digest: sha256:78035f05bcf7e0f9b401bae1ac62b5a505f95f9c2122b80cff73dcc04d58497e
Status: Downloaded newer image for k8s.gcr.io/metrics-server/metrics-server:v0.4.1
k8s.gcr.io/metrics-server/metrics-server:v0.4.1

Tag the image.

# docker tag k8s.gcr.io/metrics-server/metrics-server:v0.4.1 bldr0cuomrepo1.internal.pri:5000/metrics-server:v0.4.1

And push the newly tagged image.

# docker push bldr0cuomrepo1.internal.pri:5000/metrics-server:v0.4.1
The push refers to repository [bldr0cuomrepo1.internal.pri:5000/metrics-server]
7f4d330f3490: Pushed
7a5b9c0b4b14: Pushed
v0.4.1: digest: sha256:2009bb9ca86e8bdfc035a37561cf062f3e051c35823a5481fbd13533ce402fac size: 739

Update kube-state-metrics

The kube-state-metrics package isn’t updated this quarter.

Update filebeat-kubernetes.yaml

In the filebeat directory, run the following command to get the current filebeat-kubernetes.yaml file:

$ curl -L -O https://raw.githubusercontent.com/elastic/beats/7.10.0/deploy/kubernetes/filebeat-kubernetes.yaml

Change all references in the filebeat-kubernetes.yaml file from kube-system to monitoring. If a new installation, create the monitoring namespace.
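One way to do the namespace swap is with sed (a sketch; review the resulting diff, and only create the namespace on a new installation):

$ sed -i 's/namespace: kube-system/namespace: monitoring/' filebeat-kubernetes.yaml
$ kubectl create namespace monitoring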

Update the local repository with the new docker image.

# docker pull docker.elastic.co/beats/filebeat:7.10.0
7.10.0: Pulling from beats/filebeat
f1feca467797: Pull complete
c88871268a93: Pull complete
10b07962f975: Pull complete
72140f4e331a: Pull complete
f0b0c2d74c55: Pull complete
f331a4a38275: Pull complete
85232249e0eb: Pull complete
cef8587fe8c4: Pull complete
0663fb8750a2: Pull complete
c573ab98e4ce: Pull complete
Digest: sha256:c8c612f37e093a4b7da6b0d5fbaf68a558642405d5be98c7ec76fb1169aa93fe
Status: Downloaded newer image for docker.elastic.co/beats/filebeat:7.10.0
docker.elastic.co/beats/filebeat:7.10.0

Tag the image appropriately.

# docker tag docker.elastic.co/beats/filebeat:7.10.0 bldr0cuomrepo1.internal.pri:5000/filebeat:7.10.0

Finally, push it up to the local repository.

# docker push bldr0cuomrepo1.internal.pri:5000/filebeat:7.10.0
The push refers to repository [bldr0cuomrepo1.internal.pri:5000/filebeat]
ea58304a2317: Pushed
0dafc9982491: Pushed
27faaca5907d: Pushed
e4f67691198f: Pushed
b1e4fb67465f: Pushed
bb10e40dd1a4: Pushed
a80f7773385e: Pushed
3ccd96885c69: Pushed
565a72108ad2: Pushed
613be09ab3c0: Pushed
7.10.0: digest: sha256:16f8b41f68920f94fdc101e5af06c658d3a846168c0b76738097fd19cf6e32b3 size: 2405

Once the image is hosted locally, copy the file into each of the cluster directories and make the following changes.

DaemonSet Changes

In the filebeat folder are two files: a config file and an update file. These files automatically make changes to the filebeat-kubernetes.yaml file based on some of the changes described below. The changes below prepare the file for the script, which populates the different clusters with the correct information.

  • Switches the docker.elastic.co/beats image with bldr0cuomrepo1.internal.pri:5000
  • Replaces <elasticsearch> with the actual ELK Master server name
  • Switches the kube-system namespace with monitoring. You’ll need to ensure the monitoring namespace has been created before applying this .yaml file.
  • Replaces DEPLOY_ENV with the expected deployment environment name; dev, sqa, staging, or prod. These names are used in the ELK cluster to easily identify where the logs are sourced.

Change the values in the following lines to match:

        - name: ELASTICSEARCH_HOST
          value: "<elasticsearch>"
        - name: ELASTICSEARCH_PORT
          value: "9200"
        - name: ELASTICSEARCH_USERNAME
          value: ""
        - name: ELASTICSEARCH_PASSWORD
          value: ""

In addition, remove the following lines. They confuse the container if they exist.

        - name: ELASTIC_CLOUD_ID
          value:
        - name: ELASTIC_CLOUD_AUTH
          value:

Add the default username and password to the following lines as noted:

    output.elasticsearch:
      hosts: ['${ELASTICSEARCH_HOST:elasticsearch}:${ELASTICSEARCH_PORT:9200}']
      username: ${ELASTICSEARCH_USERNAME:elastic}
      password: ${ELASTICSEARCH_PASSWORD:changeme}

ConfigMap Changes

In the ConfigMap section, activate the filebeat.autodiscover section by uncommenting it and delete the filebeat.inputs configuration section. In the filebeat.autodiscover section, make the following three changes as noted with comments.

filebeat.autodiscover:
  providers:
    - type: kubernetes
      host: ${NODE_NAME}                          # rename node to host
      hints.enabled: true
      hints.default_config.enabled: false         # add this line
      hints.default_config:
        type: container
        paths:
          - /var/log/containers/*${data.kubernetes.container.id}.log
        exclude_lines: ["^\\s+[\\-`('.|_]"]  # drop asciiart lines  # add this line

In the processors section, remove the cloud.id and cloud.auth lines, add the following lines, and change DEPLOY_ENV to the environment filebeat is being deployed to: dev, sqa, staging, or prod.

   - add_fields:
       target: ''
       fields:
         environment: 'DEPLOY_ENV'

Elastic Stack in Development

This Elastic Stack cluster is used by the Development Kubernetes clusters. Update the files in the bldr0-0 subdirectory.

- name: ELASTICSEARCH_HOST
  value: bldr0cuomifem1.internal.pri

Elastic Stack in QA

This Elastic Stack cluster is used by the QA Kubernetes clusters. Update the files in the cabo0-0 directory.

- name: ELASTICSEARCH_HOST
  value: cabo0cuomifem1.internal.pri

Elastic Stack in Staging

This Elastic Stack cluster is used by the Staging Kubernetes clusters. Update the files in the tato0-1 directory.

- name: ELASTICSEARCH_HOST
  value: tato0cuomifem1.internal.pri

Elastic Stack in Production

This Elastic Stack cluster is used by the Production Kubernetes cluster. Update the file in the lnmt1-2 directory.

- name: ELASTICSEARCH_HOST
  value: lnmt1cuelkmstr1.internal.pri
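
Once the per-cluster values are in place, applying the manifest is the same in every environment. A minimal sketch; the k8s-app=filebeat label matches the stock manifest, so adjust it if yours differs:

# Apply the customized DaemonSet; the monitoring namespace must already exist
kubectl apply -f filebeat-kubernetes.yaml

# Confirm a filebeat pod starts on each node
kubectl -n monitoring get pods -l k8s-app=filebeat -o wide
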
Posted in Computers, Kubernetes | Tagged | Leave a comment

Kubernetes Upgrade to 1.19.6

Upgrading Kubernetes Clusters

The following lists what software and pods will be upgraded during this quarter; a sketch of the typical kubeadm upgrade commands follows the list.

  • Upgrade the Operating System
  • Upgrade Kubernetes
    • Upgrade kubeadm, kubectl, and kubelet RPMs from 1.18.8 to 1.19.6.
    • Upgrade kubernetes-cni RPM from 0.8.6-0 to 0.8.7-0.
    • kube-apiserver is upgraded from 1.18.8 to 1.19.6 automatically.
    • kube-controller-manager is upgraded from 1.18.8 to 1.19.6 automatically.
    • kube-scheduler is upgraded from 1.18.8 to 1.19.6 automatically.
    • kube-proxy is upgraded from 1.18.8 to 1.19.6 automatically.
    • coredns is upgraded from 1.6.7 to 1.7.0 automatically.
    • etcd is upgraded from 3.4.3-0 to 3.4.13-0 automatically.
    • pause is upgraded from 3.1 to 3.2 automatically.
  • Upgrade Calico from 3.16.0 to 3.17.1.
  • Upgrade Filebeat from 7.9.2 to 7.10.0.
  • Upgrade docker from 1.13.1-162 to 1.13.1-203.
  • metrics-server is upgraded from 0.3.7 to 0.4.1.
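
For reference, a minimal sketch of the upgrade flow on the first control plane node, assuming the standard kubeadm procedure and the RPM versions listed above (exact package names in your repository may differ):

# First control plane node (drain/uncordon steps omitted for brevity)
yum install -y kubeadm-1.19.6 kubelet-1.19.6 kubectl-1.19.6 kubernetes-cni-0.8.7
kubeadm upgrade plan
kubeadm upgrade apply v1.19.6
systemctl daemon-reload && systemctl restart kubelet

# Remaining masters and workers: upgrade the same RPMs, then
kubeadm upgrade node
systemctl daemon-reload && systemctl restart kubelet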

Unchanged Products

The following products do not have an upgrade this quarter.

  • kube-state-metrics remains at 1.9.7.

Upgrade Notes

The following notes describe changes that might affect users of the clusters when upgrading from one version to the next. They reflect what I think is relevant to our environment, so there are no discussions of Azure, although I might call it out briefly. For more details, click on the provided links. If you find something you think is relevant, please let me know and I’ll add it in.

Kubernetes Core

The following notes reflect changes that might be relevant between the currently installed 1.18.8 and 1.19.6, the target upgrade for Q1. While I try to make sure I don’t miss anything, if you’re not sure, check the links to see whether any changes apply to your product or project. As a reminder, many of the 1.18 point releases carry the same patches as the corresponding 1.19 releases.

  • 1.18.9 – Nothing of interest.
  • 1.18.10 – Nothing of interest.
  • 1.18.11 – Nothing of interest.
  • 1.18.12 – Nothing of interest.
  • 1.18.13 – Nothing of interest.
  • 1.18.14 – Nothing of interest.
  • 1.19.0 – Since this release had a longer lead time, there are a lot of changes. Other than what’s noted below, nothing jumped out as likely to cause problems, but there are enough new features that you should review this section in case something interests you.
    • Expanded CLI support for debugging. You can change the image to a different one (busybox, for example) or change the command to something like a one-day sleep so there’s time to kubectl exec in and debug the container.
    • Insert a debug container in a pod.
    • Ingress API is now GA.
    • Kubernetes images are now stored in {asia,eu,us}.gcr.io/k8s-artifacts-prod and not in k8s.gcr.io. Artifactory will need to be updated.
    • Kubernetes support has changed from 9 months (3 releases) to a year (4 releases).
    • Deprecated APIs, each v1beta1 version giving way to its replacement (a quick check of what the cluster serves is sketched after this list):
      • apiextensions.k8s.io/v1beta1 for apiextensions.k8s.io/v1
      • apiregistration.k8s.io/v1beta1 for apiregistration.k8s.io/v1
      • authentication.k8s.io/v1beta1 for authentication.k8s.io/v1
      • authorization.k8s.io/v1beta1 for authorization.k8s.io/v1
      • autoscaling/v2beta1 for autoscaling/v2beta2
      • coordination.k8s.io/v1beta1 for coordination.k8s.io/v1
      • storage.k8s.io/v1beta1 for storage.k8s.io/v1
      • networking.k8s.io/v1beta1 for networking.k8s.io/v1
  • 1.19.1 – Nothing of interest.
  • 1.19.2 – Nothing of interest.
  • 1.19.3 – Nothing of interest.
  • 1.19.4 – Nothing of interest.
  • 1.19.5 – Nothing of interest.
  • 1.19.6 – Nothing of interest.
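
Before touching any manifests, it’s worth confirming which API versions the upgraded cluster actually serves. A quick sketch using networking.k8s.io as the example group:

# List the API versions the cluster serves, filtered to one group
kubectl api-versions | grep networking.k8s.io

# Show the resources and versions served for that group
kubectl api-resources --api-group=networking.k8s.io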

kubernetes-cni

Run rpm -q --changelog kubernetes-cni

  • Updated CNI-Tools to 0.8.7

etcd

Generally there isn’t anything impactful; however, a review of the changes is always encouraged. As with all the other changelog reviews, the items I note are either bugs I’ve experienced or new features that look interesting. As always, feel free to follow the links to see what might be of interest to you.

coredns

  • 1.6.8 – Updates to plugin: azure, cache, forward (3), hosts, kubernetes (2), metrics, and pkg/up.
  • 1.7.0 – Lots of metrics name changes. New DNS64 IPv6 plugin. Plugin updates: azure, dns64, federation (removed), forward (2), k8s_external, kubernetes (4), nsid.

Calico

The major release notes are on a single page; for example, 3.16.1 through 3.16.5 all point to the 3.16 Release Notes. The individual versions are listed here so I can describe the changes, if any are relevant, between the point releases.

Note that we’re not using many of Calico’s features yet, so improvements, changes, and fixes for Calico issues aren’t likely to impact any current services.

  • 3.16.1 – Nothing of interest.
  • 3.16.2 – Nothing of interest.
  • 3.16.3 – Nothing of interest.
  • 3.16.4 – Nothing of interest.
  • 3.16.5 – Nothing of interest.
  • 3.17.0 – The default MTU changed from 1440 to 0 because Calico now determines the MTU automatically; see the quick check after this list.
  • 3.17.1 – Nothing of interest.
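
If you want to confirm what MTU Calico ends up with after the upgrade, a quick check, assuming a manifest-based install that uses the calico-config ConfigMap:

# veth_mtu of 0 means Calico auto-detects the MTU
kubectl -n kube-system get configmap calico-config -o yaml | grep -i mtu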

Filebeat

  • 7.9.3 – Nothing of interest.
  • 7.10.0 – Nothing of interest.

docker

Run rpm -q --changelog docker

  • 1.13.1-200 – rebuilt
  • 1.13.1-201 – fix “Race condition in kubelet cgroup destroy process” – Resolves: #1766665
  • 1.13.1-202 – fix “runc run: fix panic on failed init start” – Resolves: #1879425
  • 1.13.1-203 – do not enable CollectMode support yet because it is not still present in 7.6-ALT – Related: #1766665

metrics-server

  • 0.4.0 – Nothing of interest.
  • 0.4.1 – Nothing of interest.

kube-state-metrics

Stays at the current version. v2.0.0 is only available as alpha and beta releases, depending on your use case.

Posted in Computers, Kubernetes | Tagged | Leave a comment

Recabling The Cluster

Background: back in the day we had slow network speeds of 10 Megabits per second (10 Mb/s). The specification is UTP Category 3, shortened to Cat3. That’s about half the speed of typical home WiFi. (We’re not counting dial-up modem connections.)

That moved up to 100 Mb/s (Cat5), which is obviously 10 times faster than 10 Mb/s. More current is 1000 Mb/s or 1 Gb/s (Gigabit Ethernet, Cat5E) and even 10000 Mb/s or 10 Gb/s (Cat6). Super fast.

For servers in a cluster, a 10 Gb/s network card means I can move data between servers at very high speeds. Most desktop connections nowadays are likely in the 1 Gb/s range.

Back in the day, I’d also cut and crimp connectors on my own cables. It was somewhat expensive to actually buy a network cable, and since I was doing computer geek stuff anyway, I had the tools, the connectors, and even a spool of cable.

With the servers I have, I recently tried cutting and crimping again and I’m just not that patient any more. Since cables are fairly inexpensive nowadays, I simply spent $50 or so on 30 Cat5E cables.

Network cables typically have 4 pairs of individual wires (8 wires). Cat5 only calls for 2 of those pairs to be properly wired, which is all 100 Mb/s needs. Most of the time all 4 pairs are set up properly so you can get 1 Gb/s, but it’s a crap-shoot.

I wasn’t thinking about it and just labeled and plugged the necessary cables into my servers. Some links came up at 100 Mb/s and some at 1 Gb/s. It’s a "homelab" so it wasn’t a big deal and, again, I wasn’t thinking too hard about it.
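
If you’re curious what speed a given link actually negotiated, ethtool will tell you; replace the interface name with whatever your NIC is called:

# Show the negotiated speed and duplex for the interface
ethtool eno1 | grep -E 'Speed|Duplex'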

However. With new servers being added, I’ve had to copy a bunch of files from server to server and it’s taking a bloody long time in some cases. After some poking around on the internet, I found that Cat6 specifically requires all 4 pairs to be properly wired, supporting up to 10 Gb/s.

And Cat6 cables are just as inexpensive as Cat5E cables. Yep, ordered sufficient Cat6 cables to replace my existing Cat5E cables, which will likely join the rest of the orphan cables sitting in crates in the garage until I think to recycle them.

Posted in Computers | Tagged , , | Leave a comment

Backing Up A vCenter Appliance

Backing up the server data is actually built into the 6.5 vCenter Appliance. Log in to https://appliance:5480 and, under Summary, select Backup.

A couple of interesting tips came up as I proceeded, though.

Backup logs are located in /var/log/vmware/applmgmt as backup.log, which can help determine why a backup isn’t working.
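
A quick way to watch that log while a backup is running:

# Follow the appliance backup log as the backup progresses
tail -f /var/log/vmware/applmgmt/backup.log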

In the Appliance, check your settings. My DNS server and Default Gateway were incorrect and no Time Server was set. I made those updates; however, I was getting an error for the Default Gateway. I was able to correct it on the command line of the server, though.

# /opt/vmware/share/vami/vami_config_net

 Main Menu

0)      Show Current Configuration (scroll with Shift-PgUp/PgDown)
1)      Exit this program
2)      Default Gateway
3)      Hostname
4)      DNS
5)      Proxy Server
6)      IP Address Allocation for eth0
Enter a menu number [0]: 2

Warning: if any of the interfaces for this VM use DHCP,
the Hostname, DNS, and Gateway parameters will be
overwritten by information from the DHCP server.

Type Ctrl-C to go back to the Main Menu

0)      eth0
Choose the interface to associate with default gateway [0]:
Gateway will be associated with eth0
IPv4 Default Gateway [192.168.1.254]:
IPv6 Default Gateway []:
Reconfiguring eth0...
net.ipv6.conf.eth0.disable_ipv6 = 1
Network parameters successfully changed to requested values

 Main Menu

0)      Show Current Configuration (scroll with Shift-PgUp/PgDown)
1)      Exit this program
2)      Default Gateway
3)      Hostname
4)      DNS
5)      Proxy Server
6)      IP Address Allocation for eth0
Enter a menu number [0]: 1

For the actual backup, the path in the Location field needs a bleeding slash and it’s absolute: 192.168.104.60/home/cschelin/vcenter/backups. The process also creates the directory, the equivalent of mkdir -p /home/cschelin/vcenter/backups.

The other error was that the statsmonitor service wasn’t running. A bit of hunting turned that up as well.

# service-control --start vmware-statsmonitor
Perform start operation. vmon_profile=None, svc_names=['vmware-statsmonitor'], include_coreossvcs=False, include_leafossvcs=False
2020-11-25T02:39:26.028Z   Service statsmonitor state STOPPED

Successfully started service statsmonitor

And once that was done, I had a successful backup of the database.

Posted in Computers, VMware | Tagged , | Leave a comment