Ansible Tags – A Story

Started a new job back in October. The team is just me and another guy and the boss. And the other guy quit in December.

The real good thing is it’s a small single project shop and pretty much all the server work is done with Ansible so lots of playbooks. Of course the bad thing is it’s just me so I’m dissecting the playbooks to see what the previous folks did and why.

One of the things I noticed is the use of Tags. Tags are defined in several places in the calling playbooks, but apparently they aren’t used when running the playbooks and they don’t appear in the roles. They aren’t described in any documentation (what little there is), and the playbooks themselves don’t seem to need them.

I pulled up the Ansible docs on tags, checked a couple of youtube videos and an O’Reilly book and really didn’t see a need for Tags. Anything large enough where Tags might be useful probably should be broken down into smaller tasks anyway.

Then the boss made a request: we’re changing the IPs behind the load balancer and the load balancer IP itself, and he’d like it done via Ansible.

My first attempt was a task with a list of old IPs and a second task with a list of the new IPs. Use with_items and go. Added a backout task that just reversed the lists in case there was a problem.

Boss updated the request. We bring down Side A first, test to make sure it’s good, then Side B. A sequential list of tasks vs just delete and add. Okay, let’s see…

Started creating a bunch of little playbooks in part because of a manual check between changes.

  • Remove Side A from the Load Balancer
  • Remove the old IP from Side A
  • Add the new IP to Side A
  • Validate
  • Add Side A back to the Load Balancer
  • Remove Side B from the Load Balancer
  • Remove the old IP from Side B
  • Add the new IP to Side B
  • Validate
  • Add Side B back to the Load Balancer
  • Validate

So three playbooks. Well, let’s not forget creating similar playbooks to back out the change in case Validate == Failed. So three more playbooks. Plus a couple of edge cases. For example, if Side A is fine but there’s some network issue with Side B, backing out Side B might mean three of the backout tasks can be run but we’d want to leave the new Side A in the Load Balancer.

That’s a lot of playbooks.

Hey, Tags! Create one Update playbook and tag the tasks appropriately. Then a second Backout playbook and tag those tasks. Then run the Update playbook with --tags delsidealb,delsidea,addsidea.

So Tags aren’t just for long playbooks; they’re also handy for a bunch of simple tasks that need backouts and manual verifications.

Well, I thought it was cool 🙂 Learning new things is always fun and I thought I’d share.


Ansible Tags

Overview

Simply enough, Ansible Tags let you run specific tasks in a play. If you have a lengthy playbook or are testing tasks within a playbook, you can assign tags to tasks that let you run a specific task vs the entire playbook.

This is simply a summary of the uses of Ansible Tags. More of a cheat sheet than trying to instruct you in how to use Ansible Tags. The Ansible Tags Documentation is fairly short and does a good job explaining how to use Ansible Tags.

Uses

Examples

$ ansible-playbook -i inventory dns-update.yaml --tags bind9               # only run tasks tagged with bind9
$ ansible-playbook -i inventory dns-update.yaml --skip-tags bind9          # run all tasks except the ones tagged with bind9
$ ansible-playbook -i inventory dns-update.yaml --tags "bind9,restart"     # run tasks tagged with bind9 and restart
$ ansible-playbook -i inventory dns-update.yaml --tags untagged            # only run untagged tasks
$ ansible-playbook -i inventory dns-update.yaml --tags tagged              # only run tagged tasks
$ ansible-playbook -i inventory dns-update.yaml --tags all                 # run all tasks (default)

You can assign a tag to one or more tasks.

Tasks can have multiple tags.

When you create a block of tasks, you can assign a tag to that block and all tasks within the block are run when the tag is used.
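For example, a minimal sketch of a tasks file with a tagged task and a tagged block (the module arguments and file names here are illustrative, not from any real playbook):

---
- name: install bind
  yum:
    name: bind
    state: present
  tags:
    - bind9

- name: configure and restart bind
  block:
    - name: deploy named.conf
      template:
        src: named.conf.j2
        dest: /etc/named.conf
    - name: restart named
      systemd:
        name: named
        state: restarted
  tags:
    - bind9
    - restart

Running with --tags restart executes both tasks inside the block, since the tag is applied at the block level.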

An interesting idea might be to add a debug tag to all the debug statements in your playbooks and then, when ready to run live, pass the --skip-tags debug flag to the playbook. Then only the non-debug tasks are executed.
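Something along these lines (the fact shown is just an example):

- name: show the default interface facts
  debug:
    var: ansible_default_ipv4
  tags:
    - debug

$ ansible-playbook -i inventory dns-update.yaml --skip-tags debug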

Special Tags

If you assign an always tag to a task, it will always run no matter what --tags value is passed, unless you specifically pass --skip-tags always.

If you assign a never tag to a task, it will not run unless you call it out specifically. Something like calling the playbook with --tags all,never.
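A short sketch of both (the task names, tag names, and paths here are made up):

- name: refresh facts on every run
  setup:
  tags:
    - always

- name: remove test data, only when explicitly requested
  file:
    path: /tmp/testdata
    state: absent
  tags:
    - never
    - cleanup

The first task runs on every invocation regardless of the tags passed. The second only runs when something like --tags cleanup or --tags all,never is passed.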

Tag Inheritance

There are two types of statements that add tasks: the dynamic include_role, include_tasks, and include_vars, and the static import_role and import_tasks.

If you tag a task that contains an include_role or include_tasks function, only tasks within that included file that are similarly tagged will run when the tag is passed.

If you tag a task that contains an import_role or import_tasks function, all tasks within that imported file will be run when the tag is passed.
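A minimal sketch of the difference, assuming a webserver.yaml tasks file exists alongside the playbook:

# static import: the web tag applies to every task in webserver.yaml
- import_tasks: webserver.yaml
  tags:
    - web

# dynamic include: only tasks inside webserver.yaml that are themselves tagged web will run
- include_tasks: webserver.yaml
  tags:
    - web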

Listing Tags

Using the --list-tags option to ansible-playbook lists all the tags in the playbook and exits without running anything.
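For example:

$ ansible-playbook -i inventory dns-update.yaml --list-tags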

References

There are several sites that provide information on tags, but the obvious one is the Ansible Documentation.


Ansible Handlers

Overview

Ansible Handlers are tasks that are only performed when a calling task has successfully changed something.

Updating Docker

Say, for example, you want to try to update docker. There isn’t always an update available, but if there is one and the server is updated, docker needs to be restarted.

In the roles/docker/tasks directory, the main.yaml file looks like:

---
- name: update docker
  yum:
    name: docker
    state: latest
  notify:
  - Restart docker

In the roles/docker/handlers directory, the main.yaml file looks like:

---
- name: Restart docker
  systemd:
    daemon_reload: yes
    name: docker
    state: restarted

Notice in the first code block the notify line followed by the name of the Handler to call, Restart docker. Note that Handler names need to be unique; if more than one Handler shares a name, only the last one defined is used, and no matter how many tasks notify it, the Handler runs only once.

If docker can be updated, the following example shows the results. Note the changed at the start of the lines indicating an upgrade or restart has occurred. The ok indicates no change was performed.

PLAY [kube-bldr0-0-worker] ***********************************************************************************************************************************
 
TASK [Gathering Facts] ***************************************************************************************************************************************
ok: [bldr0cuomknode1]
ok: [bldr0cuomknode3]
ok: [bldr0cuomknode2]
 
TASK [docker : update docker] ***************************************************************************************************************************
changed: [bldr0cuomknode1]
changed: [bldr0cuomknode2]
changed: [bldr0cuomknode3]
 
RUNNING HANDLER [docker : Restart docker] *********************************************************************************************************************
changed: [bldr0cuomknode1]
changed: [bldr0cuomknode2]
changed: [bldr0cuomknode3]
 
NO MORE HOSTS LEFT *******************************************************************************************************************************************
 
PLAY RECAP ***************************************************************************************************************************************************
bldr0cuomknode1            : ok=3    changed=2    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0
bldr0cuomknode2            : ok=3    changed=2    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0
bldr0cuomknode3            : ok=3    changed=2    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0

If you run the playbook again, the Handler will not be called as an update isn’t necessary. Again, note the ok at the start of the lines indicating no change occurred.

PLAY [kube-bldr0-0-worker] ***********************************************************************************************************************************
 
TASK [Gathering Facts] ***************************************************************************************************************************************
ok: [bldr0cuomknode1]
ok: [bldr0cuomknode2]
ok: [bldr0cuomknode3]
 
TASK [docker : update docker] ***************************************************************************************************************************
ok: [bldr0cuomknode1]
ok: [bldr0cuomknode2]
ok: [bldr0cuomknode3]
 
PLAY RECAP ***************************************************************************************************************************************************
bldr0cuomknode1            : ok=2    changed=0    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0
bldr0cuomknode2            : ok=2    changed=0    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0
bldr0cuomknode3            : ok=2    changed=0    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0



Configuring Katello 3.15

I’m currently using Spacewalk to manage patches on my servers; however, it doesn’t support CentOS 8 and I suspect there’ll be more and more issues. Plus, my previous job used Satellite and the current one wants me to use Katello, so I’ve installed it on one of my VMs.

The idea here is to get the main configuration done in such a way that I’m ready to configure Products and Repositories and start Syncing, ready to update all the servers. Most of the Katello instructions explain how to use the various options, but not the order to do things in or what to make sure gets done.

Katello already has a default Organization after installation. You’ll likely want to rename it to something appropriate.

Click on Administer -> Organizations. Click on the Default one to edit it and rename it. Then click Submit.

Next, add the various locations. I have multiple pseudo data centers that align with environments. Click on Administer -> New Location, enter in the Name and a Description, and click Submit. When the Detail page is displayed, click on Organizations and add the new organization to the location.

Once the Organization is edited and all the Locations are added, the rest of the configuration involves adding them to Subnets and Domains.

Next up, add the various Puppet Environments. Click on Configure -> Environments and click on the Create Puppet Environment button to create a new Environment. Enter in the Name, click on Locations and add the Environment to the correct Location then make sure the Environment is associated with the Organization. Click Submit to save your changes.

Verify the Domain by clicking on Infrastructure -> Domains. Add a Description if missing, associate the Domain with Locations and the Organization, and then click Submit to save.

Next, add the necessary Subnets. This is a bit more involved. Click Infrastructure -> Subnets then click on the Create Subnet button.

Fill out the Name of the Subnet, a Description, Network Address and Prefix, Mask, Gateway Address, DNS Servers, and VLAN ID. You can select an IPAM option and Katello will try to anticipate the IP Addresses when they’re added. Change to the Domain tab and update that. Click on Locations and add the necessary ones. And make sure the Subnet is added to the correct Organization.

One last step is to go through the above settings and make sure all the fields are filled in. Organization, Locations, Domains, etc.


Out Of Memory Killer

In Linux, there’s a kernel mechanism called the OOM Killer that kills processes when available memory gets too low.

While this is an interesting idea, it can kill your application before it kills something that might be a lower priority.

Of course, the best solution is to add more memory to a server. But if it’s not immediately possible, you can make some changes to memory management and the OOM Killer to make sure your application has the highest priority for memory usage.

The /proc/[process id]/oom_adj file holds the OOM rating of a process. By default, every process has a 0 (zero) rating; the higher the value, the sooner the process is killed, and the lower the value, the longer it survives. A quick look at one of my servers shows a lot of zeros, a couple of -4’s, a bunch of -15’s, and a few -17’s (some daemons lower their own rating). Based on how the OOM Killer works, any process with a zero rating will be killed before the -4, -15, or -17 processes, with -17 effectively last; in fact, -17 disables OOM killing for that process entirely.
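A quick, rough way to see the current ratings on a box, with the most killable processes at the top (a one-liner sketch):

$ for d in /proc/[0-9]*; do printf "%5s %s\n" "$(cat $d/oom_adj 2>/dev/null)" "$(cat $d/comm 2>/dev/null)"; done | sort -rn | head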

In order to ensure the application has the highest priority, make a list of the processes (not process IDs) that are a lower priority, such as monitoring or backup agents, and whip up a script that runs regularly. The script retrieves the PID of each listed process and raises its oom_adj value. This ensures the lower priority process is killed before a higher, more important application is touched. I use 10; it can be anything greater than zero, up to the maximum of 15.

#!/bin/bash

# List of lower-priority process names (monitoring agents, backup agents, etc.);
# these names are examples, adjust the list for your environment.
oompriority="monitoring_agent backup_agent"

for i in ${oompriority}
do
  # ps -e prints "PID TTY TIME CMD"; match the command name and grab the PID(s)
  for OOMPID in $(ps -e | awk -v proc="${i}" '$4 ~ proc {print $1}')
  do
    # Higher oom_adj means the process is killed sooner (valid range is -17 to 15)
    echo 10 > /proc/${OOMPID}/oom_adj
  done
done



Kubernetes Delete an etcd Member

On the Kubernetes cluster, one of the etcd members had a falling out and is reporting the data is stale. While troubleshooting, we came up with several ideas including just rebuilding the cluster. It’s not all that hard overall but still causes some angst because everyone gets new tokens and applications have to be redeployed.

The process itself is simple enough.

etcdctl member list
etcdctl member remove [member hex code]

Since etcd is TLS based with certificates, you actually have to pass the certificate information on the command line. In addition, you may have to go into the etcd pod to use its etcdctl command if you don’t have a current etcdctl binary installed.

The command is the same though, whether you’re in the pod itself (easy to do from a central console) or running it on one of the masters where the etcd certs are also installed.

etcdctl member list --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key
59721c313837f64a, started, bldr0cuomkube3.internal.pri, https://192.168.101.71:2380, https://192.168.101.71:2379, false
cd0ea44e64569de6, started, bldr0cuomkube2.internal.pri, https://192.168.101.73:2380, https://192.168.101.73:2379, false
e588b22b4be790ad, started, bldr0cuomkube1.internal.pri, https://192.168.101.72:2380, https://192.168.101.72:2379, false
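If there isn’t a current etcdctl binary on the master, the same command can be run from inside one of the etcd pods. A sketch, assuming the pod name follows the usual etcd-[node name] pattern:

kubectl -n kube-system exec -it etcd-bldr0cuomkube1.internal.pri -- etcdctl member list --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key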

Then you simply run the remove command with the hex code of the member to drop.

etcdctl member remove e588b22b4be790ad --cacert=/etc/kubernetes/pki/etcd/ca.crt --cert=/etc/kubernetes/pki/etcd/server.crt --key=/etc/kubernetes/pki/etcd/server.key

And the etcd member has been removed.


Turkey Pot Pie

This was a real good recipe I found online when we had to do something with 22 lbs of turkey. We’ve been getting pot pies lately from the store so it was already something on my mind and I figured, let’s see what I can do.

Ingredients

  • 2 cups frozen peas and carrots or other mixed vegetables.
  • 2 cups frozen green beans
  • 1 cup chopped celery
  • 2/3 cup of butter (1 1/3 sticks)
  • 2/3 cup chopped onion
  • 2/3 cup all-purpose flour
  • 1 teaspoon salt
  • 1 teaspoon pepper
  • 1/2 teaspoon celery seed (I’ve had these around for quite some time)
  • 1/2 teaspoon onion powder
  • 1/2 teaspoon Italian seasoning
  • 1 3/4 cups of chicken broth
  • 1 1/3 cups milk
  • 4 cups cubed and cooked turkey meat. Half light, half dark.
  • 4 9 inch unbaked pie crusts. I used the Pillsbury ones.

Step 1 – Preheat the oven to 425F.

Step 2 – Cook the frozen vegetables and the chopped celery in a medium/large saucepan. Boil and then simmer until the celery is tender, about 8 to 10 minutes. Drain the vegetables and set aside.

Step 3 – Melt the butter in the saucepan over medium heat, add the onion and cook until translucent, about 5 minutes. Then stir in the flour, salt, black pepper, celery seed, onion powder, and Italian seasoning. Slowly whisk in the chicken broth and milk until the mixture thickens. It was pretty liquid for most of the time, then I turned around for about 30 seconds and whomp, it was thickened. At that point, remove from the heat and stir in the cooked vegetables and turkey meat until well combined.

Step 4 – Fit two of the pie crusts into the bottoms of the pie dishes. Spoon half the vegetable and turkey filling into each of the dishes and cover each with a second pie crust. As with a regular pie, pinch the edges together all the way around. You might cut a couple of slits into the top. I didn’t and it seemed to be fine.

Step 5 – Bake in the oven until the crusts are nice and brown. Between 30 and 35 minutes. Cover with aluminum foil if the tops are getting brown too fast. Once done, cool for about 10 minutes and serve.


State of the Game Room 2020

My yearly list of what has arrived in the game room and maybe even what we’ve played over the past year.

History

Statistics

  • 153 New Entries in the Inventory database since December 30th 2019.
  • 22 Arkham Horror: The Card Game additions.
  • 30 Shadowrun additions.
  • 11 Dungeons & Dragons additions.
  • 17 other RPG purchases.
  • 27 new and used Board Games.
  • 13 new and used Card Games.

I say New and Used in a few places. This is because of two events over the past year. First is a friend and fellow gamer moving away from Colorado and back east to Virginia. Wen is a consummate gamer and an all around great guy. We miss him in Colorado. As part of his departure, he was selling off some gaming gear. As someone always on the lookout for games, I headed down and looked over his gear. While I did get a pretty good stack (filled up the trunk on the motorcycle), the ones of note are Formula E, which is Elephant racing! 😀 and an old game I used to have as a kid, Stratego.

Shadowrun

The second event was helping a fellow gamer who was having some personal troubles. He was selling some of his older, and hopefully not-in-use, gear and making it available to the Shadowrun Facebook group. Bull is a pretty well known Ork so many folks stepped right on up. Generally I picked up some of this and some of that. In particular a Nerps pack of cards, a Shadowrun poster, Leviathan, and a few other bits.

Speaking of Shadowrun, I have a few interesting items this year. As part of a display of gear for the Shadowrun Facebook group, I snapped a pic of the crazy number of Limited Edition books that came out for Shadowrun 5th Edition. It turns out I had missed two, which I was able to easily track down: one from Catalyst itself and one on Amazon. In addition, I snagged the Executive Edition for 6th.

The other picture was of all the miscellaneous Shadowrun kit like dice ($150 a pack!!!), pins, glasses, and posters. I even have a Chessex Vinyl Shadowrun cover.

A friend from work happened to get two copies of the Shadowrun Sprawl Ops board game and he gifted it to me knowing I play. We’ve had several discussions on Shadowrun before he was laid off.

But the best was tracking down the last two cards from the Denver Box Set. There are 6 plastic passes used to travel between the sections of Denver, and each box only has 2, meaning you had to hunt for the correct box. I was able to get two more from eBay back in 2006 when I got back into gaming and then stumbled upon someone selling a box with the last ones I needed, which included the Aztlan pass. This gave me a complete set of cards!

Due to the Covid Virus and mask requirement, when the Shadowrun Masks became available, I picked up 5 sets along with other bits like dice and an S Shadowrun pin.

Over the past year, Jamie from Atomic Goblin Games in Longmont, Colorado, dumped some of his extra stuff that was just sitting around into my lap. As such, I acquired several Netrunner bits as part of a tournament kit and a pack of Star Wars X-Wing tournament bits. He’s also given me two sets of the Shadowrun miniatures and card decks, as he gets them from Catalyst for free. He also got some Dark Souls miniatures expansions that I was able to get at his cost since I was the only one who played it.

The last bit of gear was from a posting on the work communications system (WebEx). We’d played some Munchkin this past year and I mentioned it and someone said there was a Munchkin RPG. Oh really. I was able to track down the available books and picked it up. Very fun reading.

reddit Questions

Several things this year have reduced the number of games we wanted to play. We were doing a lot of house hunting and finally purchased a house in August. There were a lot of different requirements and deadlines but we got it done. Add in all the moving hassle and less time was available for gaming. While the band was able to come up on weekends to practice, due to the drummer’s job change, we lost a lot of gaming time. Then Colorado went Red so we couldn’t even have guests over. Jeanne and I did get some gaming in. We both changed jobs towards the end of the year which had us doing more work related stuff; getting up to speed on the different technology for example.

Blog and Photos

Link to my blog where I go into more detail and have more pictures.

How Long Have You Been Gaming?

Well, I’m 63 now and started playing various games as a kid. My grandfather played gin and gin rummy with me when I was over and the adults played pinochle although I was able to play from time to time as well. We played all the standard games. Monopoly, Battleship, and even Chess. We started gaming more when we started doing Family Home Evening (Mormon thing) and I was introduced to Outdoor Survival, an Avalon Hill game. From there into wargaming and beyond!

Gaming This Year

We played quite a few games this year, in part due to Covid. Formula De, Raccoon Tycoon, Resident Evil, Munchkin, The Witches, Ticket to Ride (Rails and Sails), Splendor, The Doom That Came To Atlantic City, Car Wars, Nuclear War, Shadowrun Sprawl Ops, and Savage Worlds: Deadlands.

Favorite Board/Card Games

Of the past years plays: Probably the Resident Evil card game. That surprised me as being a pretty good game.

More current games: Ticket to Ride: Rails and Sails, Castles of Burgundy, Discoveries of Lewis and Clark, Bunny Kingdom, Formula De and Splendor.

Older: Car Wars, Cosmic Encounters, Nuclear War, and Ace of Aces.

Incoming and Outgoing

Generally Jamie down at Atomic Goblin Games will pick some out for me to check out. Other than Shadowrun books and paraphernalia, the only thing coming in is via Kickstarter and it’s the Steve Jackson Car Wars game.

As to outgoing, I just don’t do that. I was close to selling off a bunch of my collection back in the 90’s when I got into video games, but I backed out and have since seen many people who regretted getting rid of this game or that. I have the room for the gear, so it stays. Maybe next year. 🙂

Game Room Pictures

As to the game room, I did pick up another couple of Ikea Kallax boxes. A 4×4 one and a 2×4 one, which I put on top of the 4×4 one, resulting in a 6×4 configuration. Currently I have 4 5×5 shelves with a 1×4 shelf on top of each, 2 4×4 shelves with a 4×2 on top of each, a 1×4 shelf, and 2 2×4 shelves with a 2×2 on top of each, for a total of 192 Kallax squares of games.

And Pictures! These are going from entrance left side clockwise around the room.

All of the pictures are linked here if you want to see bigger ones. Game On!


Kubernetes Ansible Upgrade to 1.19.6

Upgrading Kubernetes Clusters

This document provides a guide to upgrading the Kubernetes clusters in the quickest manner. Much of the upgrade process can be done using Ansible Playbooks. There are a few processes that need to be done centrally on the tool server. And the OS and control plane updates are also manual in part due to the requirement to manually remove servers from the Kubernetes API pool.

In most cases, examples are not provided as it is assumed that you are familiar with the processes and can perform the updates without having to be reminded of how to verify.

For any process that is performed with an Ansible Playbook, it is assumed you are on the lnmt1cuomtool11 server in the /usr/local/admin/playbooks/cschelin/kubernetes directory. All Ansible related steps expect to start from that directory. In addition, the application of pod configurations will be in the configurations subdirectory.

Perform Upgrades

Patch Servers

In the 00-osupgrade directory, you’ll be running the master and worker scripts. I recommend opening two windows, one for master and one for worker, and running each script with master -t [tag] and worker -t [tag]. This will verify a node is Ready, drain the node from the pool if a worker, perform a yum upgrade and reboot, uncordon again if a worker, and verify the nodes are Ready again. Should a node fail to be ready in time, the script will exit.

Update Versionlock And Components

In the 03-packages directory, run the update -t [tag] script. This will install yum-plugin-versionlock if missing, remove old versionlocks, create new versionlocks for kubernetes, kubernetes-cni, and docker, and then the components will be upgraded.

Upgrade Kubernetes

Using the kubeadm command on the first master server, upgrade the first master server.

# kubeadm upgrade apply 1.19.6

Upgrade Control Planes

On the second and third master, run the kubeadm upgrade apply 1.19.6 command and the control plane will be upgraded.

Update kube-proxy

Check the kube-proxy daemonset and update the image tag if required.

$ kubectl edit daemonset kube-proxy -n kube-system
  • Change the image, replacing k8s.gcr.io with bldr0cuomrepo1.internal.pri:5000

Save the changes.

Update coredns

Check the coredns-deployment and update the image tag if required.

$ kubectl edit deployment coredns -n kube-system
  • Change the image, replacing k8s.gcr.io with bldr0cuomrepo1.internal.pri:5000

Save the changes.

Restart kubelet and docker

In the 04-kubelet directory, run the update -t [tag] script. This will restart kubelet and docker on all servers.

Calico Upgrade

In the configurations/calico directory, run the following command:

$ kubectl apply -f calico.yaml

calicoctl Upgrade

Pull the updated calicoctl binary and copy it to /usr/local/bin.

$ curl -O -L  https://github.com/projectcalico/calicoctl/releases/download/v3.16.0/calicoctl
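The curl call only downloads the binary into the current directory; it still needs to be made executable and moved into place, something like:

$ chmod +x calicoctl
$ sudo mv calicoctl /usr/local/bin/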

Update File and Directory Permissions and Manifests

In the postinstall directory, run the update -t [tag] script. This will perform the following steps.

  • Add the cluster-name to the kube-controller-manager.yaml file
  • Update the imagePullPolicy and image lines in all manifests
  • Add the AlwaysPullImages and ResourceQuota admission controllers to the kube-apiserver.yaml file.
  • Update the permissions of all files and directories.

Filebeat Upgrade

In the configurations directory, change to the appropriate cluster context directory, bldr0-0, cabo0-0, tato0-1, and lnmt1-2 and run the following command.

$ kubectl apply -f filebeat-kubernetes.yaml


Kubernetes Manual Upgrade to 1.19.6

Upgrading Kubernetes Clusters

This documentation is intended to provide the manual process for upgrading the server Operating Systems, Kubernetes to 1.19.6, and any additional updates. This provides example output and should help in troubleshooting should the automated processes experience a problem.

All of the steps required to prepare for an installation should be completed prior to starting this process.

Server and Kubernetes Upgrades

Patch Servers

As part of quarterly upgrades, the Operating Systems for all servers need to be upgraded.

For the control plane, there isn’t a “pool” so just patch each server and reboot it. Do one server at a time and check the status of the cluster before moving to subsequent master servers in the control plane.

For the worker nodes, you’ll need to drain each of the workers before patching and rebooting. Run the following command to both confirm the current version of 1.18.8 and that all nodes are in a Ready state to be patched:

$ kubectl get nodes
NAME                           STATUS   ROLES    AGE   VERSION
bldr0cuomknode1.internal.pri   Ready    <none>   90d   v1.18.8
bldr0cuomknode2.internal.pri   Ready    <none>   90d   v1.18.8
bldr0cuomknode3.internal.pri   Ready    <none>   90d   v1.18.8
bldr0cuomkube1.internal.pri    Ready    master   90d   v1.18.8
bldr0cuomkube2.internal.pri    Ready    master   90d   v1.18.8
bldr0cuomkube3.internal.pri    Ready    master   90d   v1.18.8

To drain a server, patch, and then return the server to the pool, follow the steps below:

$ kubectl drain [nodename] --delete-local-data --ignore-daemonsets

Then patch the server and reboot:

# yum upgrade -y
# shutdown -r now

Finally bring the node back into the pool.

$ kubectl uncordon [nodename]

Update Versionlock Information

Currently the clusters have locked kubernetes to version 1.18.8, kubernetes-cni to version 0.8.6, and docker to 1.13.1-162. The locks on each server need to be removed and new locks put in place for the new versions of kubernetes, kubernetes-cni, and docker where appropriate.

Versionlock file location: /etc/yum/pluginconf.d/

Simply delete the existing locks:

/usr/bin/yum versionlock delete "kubelet.*"
/usr/bin/yum versionlock delete "kubectl.*"
/usr/bin/yum versionlock delete "kubeadm.*"
/usr/bin/yum versionlock delete "kubernetes-cni.*"
/usr/bin/yum versionlock delete "docker.*"
/usr/bin/yum versionlock delete "docker-common.*"
/usr/bin/yum versionlock delete "docker-client.*"
/usr/bin/yum versionlock delete "docker-rhel-push-plugin.*"

And then add in the new locks at the desired levels:

/usr/bin/yum versionlock add "kubelet-1.19.6-0.*"
/usr/bin/yum versionlock add "kubectl-1.19.6-0.*"
/usr/bin/yum versionlock add "kubeadm-1.19.6-0.*"
/usr/bin/yum versionlock "docker-1.13.1-203.*"
/usr/bin/yum versionlock "docker-common-1.13.1-203.*"
/usr/bin/yum versionlock "docker-client-1.13.1-203.*"
/usr/bin/yum versionlock "docker-rhel-push-plugin-1.13.1-203.*"
/usr/bin/yum versionlock "kubernetes-cni-0.8.7-0.*"

Then install the updated kubernetes and docker binaries. Note that the versionlocked versions and the installed version must match:

/usr/bin/yum install kubelet-1.19.6-0.x86_64
/usr/bin/yum install kubectl-1.19.6-0.x86_64
/usr/bin/yum install kubeadm-1.19.6-0.x86_64
/usr/bin/yum install docker-1.13.1-203.git0be3e21.el7_8.x86_64
/usr/bin/yum install docker-common-1.13.1-203.git0be3e21.el7*
/usr/bin/yum install docker-client-1.13.1-203.git0be3e21.el7*
/usr/bin/yum install docker-rhel-push-plugin-1.13.1-203.git0be3e21.el7*
/usr/bin/yum install kubernetes-cni-0.8.7-0.x86_64

Upgrade Kubernetes

Using the kubeadm command on the first master server, you can review the plan and then upgrade the cluster:

# kubeadm upgrade plan
[upgrade/config] Making sure the configuration is correct:
[upgrade/config] Reading configuration from the cluster...
[upgrade/config] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[preflight] Running pre-flight checks.
[upgrade] Running cluster health checks
[upgrade] Fetching available versions to upgrade to
[upgrade/versions] Cluster version: v1.18.8
[upgrade/versions] kubeadm version: v1.19.6
I1224 02:04:43.067987 8753 version.go:252] remote version is much newer: v1.20.1; falling back to: stable-1.19
[upgrade/versions] Latest stable version: v1.19.6
[upgrade/versions] Latest stable version: v1.19.6
[upgrade/versions] Latest version in the v1.18 series: v1.18.14
[upgrade/versions] Latest version in the v1.18 series: v1.18.14

Components that must be upgraded manually after you have upgraded the control plane with 'kubeadm upgrade apply':
COMPONENT CURRENT AVAILABLE
kubelet 6 x v1.18.8 v1.18.14

Upgrade to the latest version in the v1.18 series:

COMPONENT CURRENT AVAILABLE
kube-apiserver v1.18.8 v1.18.14
kube-controller-manager v1.18.8 v1.18.14
kube-scheduler v1.18.8 v1.18.14
kube-proxy v1.18.8 v1.18.14
CoreDNS 1.6.7 1.7.0
etcd 3.4.3-0 3.4.3-0

You can now apply the upgrade by executing the following command:

kubeadm upgrade apply v1.18.14

_____________________________________________________________________

Components that must be upgraded manually after you have upgraded the control plane with 'kubeadm upgrade apply':
COMPONENT CURRENT AVAILABLE
kubelet 6 x v1.18.8 v1.19.6

Upgrade to the latest stable version:

COMPONENT CURRENT AVAILABLE
kube-apiserver v1.18.8 v1.19.6
kube-controller-manager v1.18.8 v1.19.6
kube-scheduler v1.18.8 v1.19.6
kube-proxy v1.18.8 v1.19.6
CoreDNS 1.6.7 1.7.0
etcd 3.4.3-0 3.4.13-0

You can now apply the upgrade by executing the following command:

kubeadm upgrade apply v1.19.6

_____________________________________________________________________


The table below shows the current state of component configs as understood by this version of kubeadm.
Configs that have a "yes" mark in the "MANUAL UPGRADE REQUIRED" column require manual config upgrade or
resetting to kubeadm defaults before a successful upgrade can be performed. The version to manually
upgrade to is denoted in the "PREFERRED VERSION" column.

API GROUP CURRENT VERSION PREFERRED VERSION MANUAL UPGRADE REQUIRED
kubeproxy.config.k8s.io v1alpha1 v1alpha1 no
kubelet.config.k8s.io v1beta1 v1beta1 no
_____________________________________________________________________

There are likely newer versions of Kubernetes control plane containers available. In order to maintain consistency across all clusters, only upgrade the masters to 1.19.6:

# kubeadm upgrade apply 1.19.6
[upgrade/config] Making sure the configuration is correct:
[upgrade/config] Reading configuration from the cluster...
[upgrade/config] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -oyaml'
[preflight] Running pre-flight checks.
[upgrade] Running cluster health checks
[upgrade/version] You have chosen to change the cluster version to "v1.19.6"
[upgrade/versions] Cluster version: v1.18.8
[upgrade/versions] kubeadm version: v1.19.6
[upgrade/confirm] Are you sure you want to proceed with the upgrade? [y/N]: y
[upgrade/prepull] Pulling images required for setting up a Kubernetes cluster
[upgrade/prepull] This might take a minute or two, depending on the speed of your internet connection
[upgrade/prepull] You can also perform this action in beforehand using 'kubeadm config images pull'
[upgrade/apply] Upgrading your Static Pod-hosted control plane to version "v1.19.6"...
Static pod: kube-apiserver-bldr0cuomkube1.internal.pri hash: 053014e49eb31dd44a1951df85c466b0
Static pod: kube-controller-manager-bldr0cuomkube1.internal.pri hash: f23e1c90dbf9b2b0893cd8df7ee5d987
Static pod: kube-scheduler-bldr0cuomkube1.internal.pri hash: a3899df34b823393426e8f7ae39d8dee
[upgrade/etcd] Upgrading to TLS for etcd
Static pod: etcd-bldr0cuomkube1.internal.pri hash: 8d44a23a44041edc0180dec7c820610d
[upgrade/staticpods] Preparing for "etcd" upgrade
[upgrade/staticpods] Renewing etcd-server certificate
[upgrade/staticpods] Renewing etcd-peer certificate
[upgrade/staticpods] Renewing etcd-healthcheck-client certificate
[upgrade/staticpods] Moved new manifest to "/etc/kubernetes/manifests/etcd.yaml" and backed up old manifest to "/etc/kubernetes/tmp/kubeadm-backup-manifests-2020-12-24-21-50-13/etcd.yaml"
[upgrade/staticpods] Waiting for the kubelet to restart the component
[upgrade/staticpods] This might take a minute or longer depending on the component/version gap (timeout 5m0s)
Static pod: etcd-bldr0cuomkube1.internal.pri hash: 8d44a23a44041edc0180dec7c820610d
Static pod: etcd-bldr0cuomkube1.internal.pri hash: ab0e3948b56eb191236044c56350be62
[apiclient] Found 3 Pods for label selector component=etcd
[upgrade/staticpods] Component "etcd" upgraded successfully!
[upgrade/etcd] Waiting for etcd to become available
[upgrade/staticpods] Writing new Static Pod manifests to "/etc/kubernetes/tmp/kubeadm-upgraded-manifests840688942"
[upgrade/staticpods] Preparing for "kube-apiserver" upgrade
[upgrade/staticpods] Renewing apiserver certificate
[upgrade/staticpods] Renewing apiserver-kubelet-client certificate
[upgrade/staticpods] Renewing front-proxy-client certificate
[upgrade/staticpods] Renewing apiserver-etcd-client certificate
[upgrade/staticpods] Moved new manifest to "/etc/kubernetes/manifests/kube-apiserver.yaml" and backed up old manifest to "/etc/kubernetes/tmp/kubeadm-backup-manifests-2020-12-24-21-50-13/kube-apiserver.yaml"
[upgrade/staticpods] Waiting for the kubelet to restart the component
[upgrade/staticpods] This might take a minute or longer depending on the component/version gap (timeout 5m0s)
Static pod: kube-apiserver-bldr0cuomkube1.internal.pri hash: 053014e49eb31dd44a1951df85c466b0
Static pod: kube-apiserver-bldr0cuomkube1.internal.pri hash: 4279fd8bec56cdea97ff8f8f7f5547d3
[apiclient] Found 3 Pods for label selector component=kube-apiserver
[upgrade/staticpods] Component "kube-apiserver" upgraded successfully!
[upgrade/staticpods] Preparing for "kube-controller-manager" upgrade
[upgrade/staticpods] Renewing controller-manager.conf certificate
[upgrade/staticpods] Moved new manifest to "/etc/kubernetes/manifests/kube-controller-manager.yaml" and backed up old manifest to "/etc/kubernetes/tmp/kubeadm-backup-manifests-2020-12-24-21-50-13/kube-controller-manager.yaml"
[upgrade/staticpods] Waiting for the kubelet to restart the component
[upgrade/staticpods] This might take a minute or longer depending on the component/version gap (timeout 5m0s)
Static pod: kube-controller-manager-bldr0cuomkube1.internal.pri hash: f23e1c90dbf9b2b0893cd8df7ee5d987
Static pod: kube-controller-manager-bldr0cuomkube1.internal.pri hash: 202ee2ffdb77add9d9f3327e4fd827fc
[apiclient] Found 3 Pods for label selector component=kube-controller-manager
[upgrade/staticpods] Component "kube-controller-manager" upgraded successfully!
[upgrade/staticpods] Preparing for "kube-scheduler" upgrade
[upgrade/staticpods] Renewing scheduler.conf certificate
[upgrade/staticpods] Moved new manifest to "/etc/kubernetes/manifests/kube-scheduler.yaml" and backed up old manifest to "/etc/kubernetes/tmp/kubeadm-backup-manifests-2020-12-24-21-50-13/kube-scheduler.yaml"
[upgrade/staticpods] Waiting for the kubelet to restart the component
[upgrade/staticpods] This might take a minute or longer depending on the component/version gap (timeout 5m0s)
Static pod: kube-scheduler-bldr0cuomkube1.internal.pri hash: a3899df34b823393426e8f7ae39d8dee
Static pod: kube-scheduler-bldr0cuomkube1.internal.pri hash: 5a568caf05a8bd40ae4b30cf4dcd90eb
[apiclient] Found 3 Pods for label selector component=kube-scheduler
[upgrade/staticpods] Component "kube-scheduler" upgraded successfully!
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.19" in namespace kube-system with the configuration for the kubelets in the cluster
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy

[upgrade/successful] SUCCESS! Your cluster was upgraded to "v1.19.6". Enjoy!

[upgrade/kubelet] Now that your control plane is upgraded, please proceed with upgrading your kubelets if you haven't already done so.

Update Control Planes

On the second and third master, run the kubeadm upgrade apply 1.19.6 command and the control plane will be upgraded.

Update File and Directory Permissions

Verify the permissions match the table below once the upgrade is complete (a quick spot-check command follows the table):

Path or File   user:group   Permissions
/etc/kubernetes/manifests/etcd.yaml root:root 0644
/etc/kubernetes/manifests/kube-apiserver.yaml root:root 0644
/etc/kubernetes/manifests/kube-controller-manager.yaml root:root 0644
/etc/kubernetes/manifests/kube-scheduler.yaml root:root 0644
/var/lib/etcd root:root 0700
/etc/kubernetes/admin.conf root:root 0644
/etc/kubernetes/scheduler.conf root:root 0644
/etc/kubernetes/controller-manager.conf root:root 0644
/etc/kubernetes/pki root:root 0755
/etc/kubernetes/pki/ca.crt root:root 0644
/etc/kubernetes/pki/apiserver.crt root:root 0644
/etc/kubernetes/pki/apiserver-kubelet-client.crt root:root 0644
/etc/kubernetes/pki/front-proxy-ca.crt root:root 0644
/etc/kubernetes/pki/front-proxy-client.crt root:root 0644
/etc/kubernetes/pki/sa.pub root:root 0644
/etc/kubernetes/pki/ca.key root:root 0600
/etc/kubernetes/pki/apiserver.key root:root 0600
/etc/kubernetes/pki/apiserver-kubelet-client.key root:root 0600
/etc/kubernetes/pki/front-proxy-ca.key root:root 0600
/etc/kubernetes/pki/front-proxy-client.key root:root 0600
/etc/kubernetes/pki/sa.key root:root 0600
/etc/kubernetes/pki/etcd root:root 0755
/etc/kubernetes/pki/etcd/ca.crt root:root 0644
/etc/kubernetes/pki/etcd/server.crt root:root 0644
/etc/kubernetes/pki/etcd/peer.crt root:root 0644
/etc/kubernetes/pki/etcd/healthcheck-client.crt root:root 0644
/etc/kubernetes/pki/etcd/ca.key root:root 0600
/etc/kubernetes/pki/etcd/server.key root:root 0600
/etc/kubernetes/pki/etcd/peer.key root:root 0600
/etc/kubernetes/pki/etcd/healthcheck-client.key root:root 0600
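A quick way to spot-check a few of the entries above using GNU stat:

# stat -c '%U:%G %a %n' /etc/kubernetes/manifests/*.yaml /var/lib/etcd /etc/kubernetes/pki/ca.key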

Update Manifests

During the kubeadm upgrade, the current control plane manifests are moved from /etc/kubernetes/manifests into /etc/kubernetes/tmp and new manifest files deployed. There are multiple settings and permissions that need to be reviewed and updated before the task is considered completed.

The kubeadm-config configmap has been updated to point to bldr0cuomrepo1.internal.pri:5000; however, it and the various container configurations should be checked anyway. If it wasn’t updated or used, you’ll have to make the updates manually, including manually editing the kube-proxy daemonset configuration.

Note that when a manifest is updated, the associated image is reloaded. No need to manage the pods once manifests are updated.

etcd Manifest

Verify and update etcd.yaml

  • Change imagePullPolicy to Always.
  • Change the image, replacing k8s.gcr.io with bldr0cuomrepo1.internal.pri:5000.

kube-apiserver Manifest

Verify and update kube-apiserver.yaml

  • Add AlwaysPullImages and ResourceQuota admission controllers to the --enable-admission-plugins line
  • Change imagePullPolicy to Always
  • Change the image, replacing k8s.gcr.io with bldr0cuomrepo1.internal.pri:5000 (a sketch of the combined result follows)
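Taken together, the relevant lines of kube-apiserver.yaml end up looking something like the following; the existing plugin list and the image tag on your cluster may differ:

    - --enable-admission-plugins=NodeRestriction,AlwaysPullImages,ResourceQuota
    ...
    image: bldr0cuomrepo1.internal.pri:5000/kube-apiserver:v1.19.6
    imagePullPolicy: Always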

kube-controller-manager Manifest

Verify and update kube-controller-manager.yaml

  • Add "- --cluster-name=kubecluster-[site]" after "- --cluster-cidr=192.168.0.0/16"
  • Change imagePullPolicy to Always
  • Change the image, replacing k8s.gcr.io with bldr0cuomrepo1.internal.pri:5000

kube-scheduler Manifest

Verify and update kube-scheduler.yaml

  • Change imagePullPolicy to Always
  • Change the image, replacing k8s.gcr.io with bldr0cuomrepo1.internal.pri:5000

Update kube-proxy

You’ll need to edit the kube-proxy daemonset to change the imagePullPolicy. Check the image tag at the same time.

$ kubectl edit daemonset kube-proxy -n kube-system
  • Change imagePullPolicy to Always.
  • Change the image, replacing k8s.gcr.io with bldr0cuomrepo1.internal.pri:5000

Save the changes.

Update coredns

You’ll need to edit the coredns deployment to change the imagePullPolicy. Check the image tag at the same time.

$ kubectl edit deployment coredns -n kube-system
  • Change imagePullPolicy to Always
  • Change the image, replacing k8s.gcr.io with bldr0cuomrepo1.internal.pri:5000

Save the changes

Restart kubelet

Once done, kubelet and docker need to be restarted on all nodes.

systemctl daemon-reload
systemctl restart kubelet
systemctl restart docker

Verify

Once kubelet has been restarted on all nodes, verify all nodes are at 1.19.6.

$ kubectl get nodes
NAME                           STATUS   ROLES    AGE   VERSION
bldr0cuomknode1.internal.pri   Ready    <none>   91d   v1.19.6
bldr0cuomknode2.internal.pri   Ready    <none>   91d   v1.19.6
bldr0cuomknode3.internal.pri   Ready    <none>   91d   v1.19.6
bldr0cuomkube1.internal.pri    Ready    master   91d   v1.19.6
bldr0cuomkube2.internal.pri    Ready    master   91d   v1.19.6
bldr0cuomkube3.internal.pri    Ready    master   91d   v1.19.6

Configuration Upgrades

Configuration files are on the tool servers (lnmt1cuomtool11) in the /usr/local/admin/playbooks/cschelin/kubernetes/configurations directory and the expectation is you’ll be in that directory when directed to apply configurations.

Calico Upgrade

In the calico directory, run the following command:

$ kubectl apply -f calico.yaml
configmap/calico-config unchanged
customresourcedefinition.apiextensions.k8s.io/bgpconfigurations.crd.projectcalico.org configured
customresourcedefinition.apiextensions.k8s.io/bgppeers.crd.projectcalico.org configured
customresourcedefinition.apiextensions.k8s.io/blockaffinities.crd.projectcalico.org configured
customresourcedefinition.apiextensions.k8s.io/clusterinformations.crd.projectcalico.org configured
customresourcedefinition.apiextensions.k8s.io/felixconfigurations.crd.projectcalico.org configured
customresourcedefinition.apiextensions.k8s.io/globalnetworkpolicies.crd.projectcalico.org configured
customresourcedefinition.apiextensions.k8s.io/globalnetworksets.crd.projectcalico.org configured
customresourcedefinition.apiextensions.k8s.io/hostendpoints.crd.projectcalico.org configured
customresourcedefinition.apiextensions.k8s.io/ipamblocks.crd.projectcalico.org configured
customresourcedefinition.apiextensions.k8s.io/ipamconfigs.crd.projectcalico.org configured
customresourcedefinition.apiextensions.k8s.io/ipamhandles.crd.projectcalico.org configured
customresourcedefinition.apiextensions.k8s.io/ippools.crd.projectcalico.org configured
customresourcedefinition.apiextensions.k8s.io/kubecontrollersconfigurations.crd.projectcalico.org configured
customresourcedefinition.apiextensions.k8s.io/networkpolicies.crd.projectcalico.org configured
customresourcedefinition.apiextensions.k8s.io/networksets.crd.projectcalico.org configured
clusterrole.rbac.authorization.k8s.io/calico-kube-controllers unchanged
clusterrolebinding.rbac.authorization.k8s.io/calico-kube-controllers unchanged
clusterrole.rbac.authorization.k8s.io/calico-node unchanged
clusterrolebinding.rbac.authorization.k8s.io/calico-node unchanged
daemonset.apps/calico-node configured
serviceaccount/calico-node unchanged
deployment.apps/calico-kube-controllers configured
serviceaccount/calico-kube-controllers unchanged

After calico is applied, the calico-kube-controllers pod will restart and then the calico-node pod restarts to retrieve the updated image.

Pull the calicoctl binary and copy it to /usr/local/bin, then verify the version. Note that this has likely already been done on the tool server. Verify it before pulling the binary.

$ curl -O -L  https://github.com/projectcalico/calicoctl/releases/download/v3.17.1/calicoctl
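As above, the downloaded file still needs to be made executable and moved into place:

$ chmod +x calicoctl
$ sudo mv calicoctl /usr/local/bin/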

Verification

$ calicoctl version
Client Version:    v3.17.1
Git commit:        8871aca3
Cluster Version:   v3.17.1
Cluster Type:      k8s,bgp,kubeadm,kdd

Update CNI File Permissions

Verify the permissions of the files once the upgrade is complete.

Path or File   user:group   Permissions
/etc/cni/net.d/10-calico.conflist root:root 0644
/etc/cni/net.d/calico-kubeconfig root:root 0644

metrics-server Upgrade

In the metrics-server directory, run the following command:

$ kubectl apply -f components.yaml
serviceaccount/metrics-server created
clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader created
clusterrole.rbac.authorization.k8s.io/system:metrics-server created
rolebinding.rbac.authorization.k8s.io/metrics-server-auth-reader created
clusterrolebinding.rbac.authorization.k8s.io/metrics-server:system:auth-delegator created
clusterrolebinding.rbac.authorization.k8s.io/system:metrics-server created
service/metrics-server created
deployment.apps/metrics-server created
apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io created

Once the metrics-server deployment has been updated, the pod will restart.

kube-state-metrics Upgrade

As noted, this pod doesn’t need to be upgraded.

Filebeat Upgrade

Filebeat feeds Elastic Stack clusters in four environments, and Filebeat itself is installed on all Kubernetes clusters. Ensure you’re managing the correct cluster when upgrading the Filebeat container, as configurations are specific to each cluster.

Change to the appropriate cluster context directory and run the following command:

$ kubectl apply -f filebeat-kubernetes.yaml
configmap/filebeat-config created
daemonset.apps/filebeat created
clusterrolebinding.rbac.authorization.k8s.io/filebeat created
clusterrole.rbac.authorization.k8s.io/filebeat created
serviceaccount/filebeat created

Verification

Essentially monitor each cluster. You should see the filebeat containers restarting and returning to a Running state.

$ kubectl get pods -n monitoring -o wide
