LACP, Bonding, Bridges, and VLAN Tagging

Overview

This article provides some brief information on setting up bonded interfaces in Linux, configuring Link Aggregation Control Protocol (LACP), and then setting up tagged VLANs. Articles on these topics often assume a fair amount of background knowledge, so this one provides some background and resources to verify that knowledge or fill in gaps. It is not intended, however, to be a detailed deep dive into networking in Linux, just enough information and configuration to meet the current requirement. There are other settings and options available; feel free to continue reading on the topics and even take a class for even more information.

As always, links to some of the pages I used for this article are in the References section at the end. If you find a better link, by all means let me know and I'll add it to the references.

Link Aggregation Control Protocol

We'll start with LACP. We use LACP on a bonded interface to provide redundancy and load balancing and to improve performance. A collection of LACP interfaces is called a Link Aggregation Group (LAG). Unlike other bonding options, LACP collects all member interfaces into a single, wider trunk for traffic: if we have 4 gigabit interfaces, we have a 4 gigabit trunk and not 4 individual 1 gigabit pipes. In my example below, I have 12 network interfaces as a LAG for bond0, so a 12 gigabit trunk.

mode

To enable LACP, set the mode to 802.3ad.

lacp_rate

This flag has two options, 0 (slow) and 1 (fast). If set to 0 (slow, the default), LACPDU packets are sent to the peer (such as the switch) every 30 seconds. If set to 1 (fast), LACPDU packets are sent to the peer every 1 second.
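
Once a bond is up you can confirm the active rate through sysfs; assuming the bond is named bond0 as in the examples below, the driver prints the setting name and its numeric code together:

$ cat /sys/class/net/bond0/bonding/lacp_rate
fast 1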

xmit_hash_policy

There are three policy settings for this option; you can check the active policy with the sysfs read shown after the list.

  • layer2 (Data Link) – The default policy. The transmission hash is based on MAC addresses, so all traffic to a given destination MAC is placed on a single slave interface in the bonded LACP.
  • layer2+3 (Data Link + Network) – The transmission hash is based on MAC and IP addresses. Traffic to the same destination stays on the same slave interface, while different destinations are spread across slaves. More balanced, so it provides the best performance and stability.
  • layer3+4 (Network + Transport) – Creates a transmission hash based on the upper (Transport) layer whenever possible. This allows traffic for multiple connections to be spread over multiple slave interfaces in a bonded LACP, but a single connection will not span multiple interfaces. It reverts to layer2 (Data Link) for non-IP traffic. Note that this isn't fully compliant with LACP; it can be the best option for performance and stability, but as noted, it's not fully LACP compliant.
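
The active hash policy can be read back the same way once the bond is up; this assumes bond0 and the layer2+3 policy used in the CentOS bond configuration below:

$ cat /sys/class/net/bond0/bonding/xmit_hash_policy
layer2+3 2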

miimon

This is how often (in milliseconds) the bond checks the bonded interfaces to ensure they're still active. A good value is 100; note that if miimon isn't set at all, the driver default is 0, which disables link monitoring.

You can verify the setting by checking:

$ cat /sys/devices/virtual/net/bond0/bonding/miimon
100

Bonded Interfaces

Network Bonding is essentially combining multiple network interfaces into a single interface. There are several modes you can select to manage the behavior of the bonded interface. For our purposes, we’re using mode 4 which is the 802.3ad specification, Dynamic Link Aggregation.

Bonding Module

It's very likely that the bonding module is already available on whatever system you're setting up bonding on. But just to be sure, you'll want to verify it's available.

# modinfo bonding
filename:       /lib/modules/3.10.0-693.21.1.el7.x86_64/kernel/drivers/net/bonding/bonding.ko.xz
author:         Thomas Davis, tadavis@lbl.gov and many others
description:    Ethernet Channel Bonding Driver, v3.7.1
version:        3.7.1
license:        GPL
alias:          rtnl-link-bond
retpoline:      Y
rhelversion:    7.4
srcversion:     33C47E3D00DF16A17A5AB9C
depends:
intree:         Y
vermagic:       3.10.0-693.21.1.el7.x86_64 SMP mod_unload modversions
signer:         CentOS Linux kernel signing key
sig_key:        03:DA:60:92:F6:71:13:21:B5:AC:E1:2E:84:5D:A9:73:36:F7:67:4D
sig_hashalgo:   sha256
...

If modinfo doesn't return any information, you'll want to load the bonding module.

# modprobe --first-time bonding
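
If you want the module loaded at every boot regardless of the network configuration, one option on systemd-based systems is a modules-load.d entry; this is optional, since bringing up a bond interface normally loads the module automatically:

# echo bonding > /etc/modules-load.d/bonding.conf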

Network Interfaces

The servers have a 4-port network card in addition to the four ports on the motherboard. I will be creating an 8-interface bond.

  • eno5-eno8 – On Board
  • ens1f0-ens1f3 – Network Card

Configurations

For CentOS, multiple files are created in the /etc/sysconfig/network-scripts directory for the bond0 interface and each interface as listed above.

# vi /etc/sysconfig/network-scripts/ifcfg-bond0
DEVICE=bond0
NAME=bond0
TYPE=Bond
BONDING_MASTER=yes
IPADDR=192.168.1.10
PREFIX=24
ONBOOT=yes
BOOTPROTO=none
BONDING_OPTS="mode=802.3ad miimon=100 lacp_rate=fast xmit_hash_policy=layer2+3"

For CentOS, we’ll create a file for each interface. This is just the first file but each interface file will be the same except for the DEVICE and NAME entries.

# vi /etc/sysconfig/network-scripts/ifcfg-eno5
TYPE="Ethernet"
BOOTPROTO="none"
DEFROUTE="yes"
PEERDNS="yes"
PEERROUTES="yes"
IPV4_FAILURE_FATAL="no"
IPV6INIT="no"
DEVICE="eno5"
NAME="eno5"
ONBOOT="yes"
MASTER="bond0"
SLAVE="yes"

For Ubuntu, which I'm using as my KVM server, you'll edit the single /etc/netplan/01-netcfg.yaml file. Remember that spacing is critical when working in YAML. Initially you'll disable dhcp4 for all interfaces, then set up the bonding configuration.

$ cat 01-netcfg.yaml
network:
  version: 2
  renderer: networkd
  ethernets:
    eno1:
      dhcp4: no
    eno2:
      dhcp4: no
    eno3:
      dhcp4: no
    eno4:
      dhcp4: no
    enp8s0f0:
      dhcp4: no
    enp8s0f1:
      dhcp4: no
    enp9s0f0:
      dhcp4: no
    enp9s0f1:
      dhcp4: no
    enp12s0f0:
      dhcp4: no
    enp12s0f1:
      dhcp4: no
    enp13s0f0:
      dhcp4: no
    enp13s0f1:
      dhcp4: no

  bonds:
    bond0:
      interfaces:
      - eno1
      - eno2
      - enp8s0f0
      - enp8s0f1
      - enp12s0f0
      - enp12s0f1
      parameters:
        mode: 802.3ad
        transmit-hash-policy: layer3+4
        mii-monitor-interval: 100
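
Once the YAML is in place, netplan can validate and apply it. netplan try rolls the change back automatically if you don't confirm within the timeout, which is handy when reworking the interfaces you're connected through:

$ sudo netplan try
$ sudo netplan apply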

Bond Status

You can check the bonding status by viewing the current configuration in /proc.

# cat /proc/net/bonding/bond0
Ethernet Channel Bonding Driver: v3.7.1 (April 27, 2011)

Bonding Mode: IEEE 802.3ad Dynamic link aggregation
Transmit Hash Policy: layer3+4 (1)
MII Status: up
MII Polling Interval (ms): 100
Up Delay (ms): 0
Down Delay (ms): 0
Peer Notification Delay (ms): 0

802.3ad info
LACP rate: slow
Min links: 0
Aggregator selection policy (ad_select): stable
System priority: 65535
System MAC address: be:70:a3:ab:10:6d
Active Aggregator Info:
        Aggregator ID: 1
        Number of ports: 1
        Actor Key: 9
        Partner Key: 1
        Partner Mac Address: 00:00:00:00:00:00

Slave Interface: enp13s0f1
MII Status: up
Speed: 1000 Mbps
Duplex: full
Link Failure Count: 0
Permanent HW addr: 00:1b:21:ca:12:45
Slave queue ID: 0
Aggregator ID: 1
Actor Churn State: none
Partner Churn State: churned
Actor Churned Count: 0
Partner Churned Count: 1
details actor lacp pdu:
    system priority: 65535
    system mac address: be:70:a3:ab:10:6d
    port key: 9
    port priority: 255
    port number: 1
    port state: 77
details partner lacp pdu:
    system priority: 65535
    system mac address: 00:00:00:00:00:00
    oper key: 1
    port priority: 255
    port number: 1
    port state: 1

...

Bridge Interfaces

A bridge interface connects two networks together. Generally you're connecting a VM network, such as with KVM, to the physical or bonded interface.
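
For CentOS the bridge gets its own file in /etc/sysconfig/network-scripts; assuming the usual ifcfg-&lt;name&gt; convention for the br700 interface shown below, that's:

# vi /etc/sysconfig/network-scripts/ifcfg-br700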

TYPE=Bridge
BOOTPROTO=none
DEFROUTE=yes
IPV4_FAILURE_FATAL=no
IPV6INIT=yes
IPV6_AUTOCONF=yes
IPV6_DEFROUTE=yes
IPV6_FAILURE_FATAL=no
NAME=br700
DEVICE=br700
ONBOOT=yes
IPADDR=192.168.1.10
PREFIX=24
GATEWAY=192.168.1.1
NM_CONTROLLED=no
USERCTL=no
DNS1=192.168.1.254
DOMAIN=internal.pri

You can get some bridge information using the brctl command.

# brctl show
bridge name	bridge id		STP enabled	interfaces
br2719		8000.98f2b3288d26	no		bond2.719
							vnet1
br2751		8000.98f2b3288d26	no		bond2.751
							vnet5
br2752		8000.98f2b3288d26	no		bond2.752
							vnet6
br700		8000.98f2b3288d24	no		bond0.700
							vnet0
							vnet11
							vnet13
							vnet2
							vnet20
							vnet21
							vnet3
							vnet7
							vnet8
br710		8000.98f2b3288d25	no		bond1.710
							vnet10
							vnet12
							vnet19
							vnet22
							vnet4
							vnet9
br750		8000.98f2b3288d24	no		bond0.750
virbr0		8000.525400085dad	yes		virbr0-nic
virbr1		8000.525400a5a4ea	yes		virbr1-nic

When using nmcli to create a bridge, you can connect an interface that doesn't have a connection profile or an interface with an existing connection profile. A connection profile is the configuration NetworkManager stores for an interface.

First create the bridge interface.

# nmcli con add type bridge con-name br810 ifname br810

For interfaces without a connection profile, create bridge slave profiles:

# nmcli con add type ethernet slave-type bridge con-name br810-eno5 \
    ifname eno5 master br810
# nmcli con add type ethernet slave-type bridge con-name br810-eno6 \
    ifname eno6 master br810

However, if a connection profile already exists, you just designate the bridge as its master.

# nmcli con mod bond0.810 master br810

For Ubuntu systems, you’ll again edit the /etc/netplan/01-netcfg.yaml file to add the bridge information.

  bridges:
    br0:
      dhcp4: false
      addresses:
        - 192.168.1.10/24
      gateway4: 192.168.1.254
      nameservers:
        addresses:
          - 192.168.1.254
        search: ['internal.pri', 'schelin.org']
      interfaces:
        - bond0
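
After another netplan apply, you can confirm the bridge picked up its address and member interface; a couple of quick checks, assuming the br0 name used above (brctl requires bridge-utils):

$ ip -br addr show br0
$ brctl show br0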

VLAN Tagging

VLAN tagging simply lets you tag traffic associated with the underlying interface so that the switch knows which VLAN the traffic belongs to and where it needs to be sent.

Configuration

This section describes setting up one of the Dell servers. This is the interface hierarchy: at the top level is the bridged interface, followed by the bonded and then the physical interfaces (a minimal ifcfg sketch of this stack follows the list).

  • br810 – Bridged interface. This has the IP Assignments.
    • bond0.810 – Bonded VLAN tagged interface, references br810.
      • bond0 – The underlying bonded interface. This has the BONDED_OPTS line.
        • eno5 – The lower physical interface and references bond0.
        • eno6 – The lower physical interface.
        • eno7 – The lower physical interface.
        • etc…
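
To make the hierarchy concrete, here is a minimal ifcfg sketch of the middle layer using the standard network-scripts conventions (VLAN=yes creates the tagged interface on top of bond0 and BRIDGE= enslaves it to br810); it mirrors what the nmcli commands below build rather than being an extra required file:

# vi /etc/sysconfig/network-scripts/ifcfg-bond0.810
DEVICE=bond0.810
NAME=bond0.810
VLAN=yes
ONBOOT=yes
BOOTPROTO=none
BRIDGE=br810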

Here is the list of IP addresses. This is for a dual-interface system, which my Dell is. For the Ubuntu-based Dell box, it's a single interface as described above. The only real change when viewing the examples below should be the IP address for each system.

Domain is internal.pri.

Current Server Name   New Server Name   App/IP Address    Gateway       Management/IP Address   Gateway
morgan                niki              192.168.1.10/24   192.168.1.1   10.100.78.100           10.100.78.254

The production Dell system has 8 interfaces: eno5-eno8 and ens1f0-ens1f3. We'll set up the bridge, the bonded VLAN tagged interface, the bonded interface, and then connect all the interfaces to the bond.

Since this is via the console, the example output below will not have all the information.

The configuration process is to create the bonded interface first. Then add the underlying physical interfaces. Next add a bridged interface. And finally configure the VLAN for the tagged interface.

# nmcli con add type bond con-name bond0 ifname bond0 \
    bond.options "mode=802.3ad,miimon=100,lacp_rate=fast,xmit_hash_policy=layer2+3" \
    ipv4.method disabled ipv6.method ignore

The bond0 interface doesn’t get an IP address, the bridge interface does. Hence the ipv4.method disabled option.

Next add the physical interfaces to the bond. Since they already have connection profiles, we don’t need to further configure the interfaces.

# nmcli con add type bond-slave ifname eno5 master bond0
# nmcli con add type bond-slave ifname eno6 master bond0
# nmcli con add type bond-slave ifname eno7 master bond0
# nmcli con add type bond-slave ifname eno8 master bond0
# nmcli con add type bond-slave ifname ens1f0 master bond0
# nmcli con add type bond-slave ifname ens1f1 master bond0
# nmcli con add type bond-slave ifname ens1f2 master bond0
# nmcli con add type bond-slave ifname ens1f3 master bond0
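
At this point you can sanity-check that the bond and its slave profiles all exist before moving on:

# nmcli con show | grep bond0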

Now we configure a bridge interface. Make sure you change the IP address.

# nmcli con add type bridge con-name br810 ifname br810 ip4 192.168.1.10/24

We’ll need to add the gateway as well. And change the gateway address too.

# nmcli con mod br810 ipv4.method manual ipv4.gateway 192.168.1.1

Finally add the VLAN interface on top of the bond0 interface but with the bridge interface as the master.

# nmcli con add type vlan con-name bond0.810 ifname bond0.810 dev bond0 \
    vlan.id 810 master br810 slave-type bridge

That should bring the interface up. You can review the status by:

# more /proc/net/bonding/bond0

As the network is a service network, it's not reachable, so a second bridge and VLAN have been configured. Make sure you change the IP address.

# nmcli con add type bridge con-name br700 ifname br700 ip4 10.100.78.100/24

Add the gateway. And change the gateway address too.

# nmcli con mod br700 ipv4.method manual ipv4.gateway 10.100.78.254

And the VLAN.

# nmcli con add type vlan con-name bond0.700 ifname bond0.700 dev bond0 \
    vlan.id 700 master br700 slave-type bridge

Finally a route is needed to ensure we can access the servers. Make sure you change that last IP to the gateway for the Management IP address.

# nmcli con mod br700 +ipv4.routes "192.168.0.0/16 192.168.1.1"

And remove the default route for br700. This changes the ‘DEFROUTE=yes’ option to ‘DEFROUTE=no’ and removes the gateway if set (as in above).

# nmcli con mod br700 ipv4.never-default yes
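
A quick way to confirm the routing ended up as intended, assuming the addressing above, is to list the routes and check that the default route uses br810 while br700 only carries the 192.168.0.0/16 route:

# ip route show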

References


Ansible Patterns

Overview

One of the cool things with Ansible is Patterns. With Patterns, you can combine entries in your inventory file. Instead of having a bunch of tags specific to each site, you can have a list of servers in a site and a list of server types, and combine them with a Pattern to only run the playbook against the servers that are in both tag groups.

Example Run

Say you have an inventory file where you identify all hosts in an environment, such as the Production environment; you might have a tag of longmont.

In addition, if you have different types of servers, such as DNS or even OpenShift, you might have a tag of openshift that encompasses all OpenShift servers across all environments. You could even break it down a little further, such as worker.

Those broader tags will include Boulder, Cabo, Tatooine, and Longmont servers all under one tag.

But you don't want to run this playbook against every OpenShift worker server, just against the ones in the staging environment; the staging environment in Tatooine, in fact.
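
As a rough sketch of how such an inventory might be laid out; the worker host names echo the example output below, while tato0cuomdns01 and cabo0cuomwrk01 are hypothetical extras added to show why the intersection matters:

[tatooine]
tato0cuomdns01
tato0cuomwrk01
tato0cuomwrk02
tato0cuomwrk03
tato0cuomwrk04

[openshift]
cabo0cuomwrk01
tato0cuomwrk01
tato0cuomwrk02
tato0cuomwrk03
tato0cuomwrk04

[worker]
cabo0cuomwrk01
tato0cuomwrk01
tato0cuomwrk02
tato0cuomwrk03
tato0cuomwrk04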

ansible-playbook editeth1.yaml -i inventory --ask-pass -e tag='tatooine:&openshift:&worker'

PLAY [tatooine:&openshift:&worker] *********************************************************************************************************************************************************************

TASK [Gathering Facts] **********************************************************************************************************************************************************************************
ok: [tato0cuomwrk01]
ok: [tato0cuomwrk02]
ok: [tato0cuomwrk03]
ok: [tato0cuomwrk04]

How nice is that?

Options

There are other patterns that can be used to manage the inventory, and they can be combined, as shown in the example after the list.

  • [tag]:&[tag] – Only servers that are in both lists.
  • [tag]:![tag] – Only servers that are NOT in the second list.
  • [tag]:[tag] – A combined list of all servers that are in both lists.
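
These can be chained; for example, a hypothetical run against Longmont OpenShift servers that are not workers might look like:

ansible-playbook editeth1.yaml -i inventory --ask-pass -e tag='longmont:&openshift:!worker'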

References


Kubernetes Upgrade to 1.20.6

Upgrading Kubernetes Clusters

The following lists what software and pods will be upgraded during this quarter.

  • Upgrade the Operating System.
  • Upgrade Kubernetes.
    • Upgrade kubeadm, kubectl, and kubelet RPMs from 1.19.6 to 1.20.6.
    • kube-apiserver is upgraded from 1.19.6 to 1.20.6 automatically.
    • kube-controller-manager is upgraded from 1.19.6 to 1.20.6 automatically.
    • kube-scheduler is upgraded from 1.19.6 to 1.20.6 automatically.
    • kube-proxy is upgraded from 1.19.6 to 1.20.6 automatically.
    • pause is upgraded from 3.2 to 3.4.1
  • Upgrade docker from 1.13.1-203 to 1.13.1-204.
  • Upgrade Calico from 3.17.1 to 3.18.2.
  • Upgrade Filebeat from 7.10.0 to 7.12.1
  • metrics-server is upgraded from 0.4.1 to 0.4.3.
  • kube-state-metrics is upgraded from 1.9.7 to 2.0.0.

Unchanged Products

The following products do not have an upgrade this quarter.

  • kubernetes-cni remains at 0.8.7-0.
  • coredns remains at 1.7.0.
  • etcd remains at 3.4.13-0.

Upgrade Notes

The following notes provide information on what changes might affect users of the clusters when upgrading from one version to the next. The notes I'm adding reflect what I think is relevant to our environment, so there are no discussions on Azure, although I might call it out briefly. For more details, click on the provided links. If you find something you think is relevant, please let me know and I'll add it in.

Kubernetes Core

The following notes will reflect changes that might be relevant between the currently installed 1.19.6 up through 1.20.6, the target upgrade for Q2. While I’m trying to make sure I don’t miss something, the checks are for my specific environment. If you’re not sure, check the links to see if any changes apply to your product/project. Reminder that many of the 1.19 updates are the same as the 1.20 updates. As 1.20 is updated and patched, similar 1.19 releases address the same patches.

  • 1.19.7 – CPUmanager bug fix and cadvisor metrics fix.
  • 1.19.8 – Avoid marking a node as ready before it validates all API calls at least once. Static pods are deleted gracefully.
  • 1.19.9 – Counting a pod's overhead resource usage as part of the ResourceQuota.
  • 1.19.10 – Nothing relevant to my environment.
  • 1.20.0 – The biggest change is dockershim being deprecated in favor of container runtimes such as containerd. The new API Priority and Fairness configuration is in beta; this lets you prevent an overflow of API Server requests which might impact the API Server.
  • 1.20.1
  • 1.20.2
  • 1.20.3
  • 1.20.4
  • 1.20.5
  • 1.20.6

Calico

The major release notes are on a single page; versions are noted here to describe the upgrade for each point release. For example, 3.17.2 through 3.17.4 all point to the 3.17 Release Notes. Here I'm describing the changes, if relevant, between the point releases.

Note that we're not using many of the features of Calico yet, so improvements, changes, and fixes for Calico issues aren't likely to impact any current services.

Filebeat

docker

Run rpm -q --changelog docker
  • 1.13.1-204 –

kube-state-metrics

metrics-server

References


Kubernetes Preparation Steps for 1.20.6

Upgrading Kubernetes Clusters

The purpose of this document is to provide the background information on what is being upgraded, what versions, and the steps required to prepare for the upgrade itself. These steps are only done once. Once all these steps have been completed and all the configurations checked into github and gitlab, all clusters are then ready to be upgraded.

Reference links to product documentation at the end of this document.

Upgrade Preparation Steps

Upgrades to the Sandbox environment are done a few weeks before the official release for more in-depth testing: checking the release docs, changelogs, and general operational status for the various tools that are in use.

Server Preparations

With the possibility of an upgrade to Spacewalk and to ensure the necessary software is installed prior to the upgrade, make sure all repositories are enabled and that the yum-plugin-versionlock software is installed.

Enable Repositories

Check the Spacewalk configuration and ensure that upgrades are coming from the local server and not from the internet.

Install yum versionlock

The critical components of Kubernetes are locked into place using the versionlock yum plugin. If not already installed, install it before beginning work.

# yum install yum-plugin-versionlock -y

Load Images

The next step is to load all the necessary Kubernetes, etcd, and additional images like coredns into the local repository so that the clusters aren't all pulling images from the internet. As a note, pause:3.2 has been upgraded to pause:3.4.1. Make sure you pull and update the image.

# docker pull k8s.gcr.io/kube-apiserver:v1.20.6
v1.20.6: Pulling from kube-apiserver
d94d38b8f0e6: Pull complete
6ee16ead6dee: Pull complete
ee5e6c27aaae: Pull complete
Digest: sha256:e6d960baa4219fa810ee26da8fe8a92a1cf9dae83b6ad8bda0e17ee159c68501
Status: Downloaded newer image for k8s.gcr.io/kube-apiserver:v1.20.6
k8s.gcr.io/kube-apiserver:v1.20.6
 
# docker pull k8s.gcr.io/kube-controller-manager:v1.20.6
v1.20.6: Pulling from kube-controller-manager
d94d38b8f0e6: Already exists
6ee16ead6dee: Already exists
a484c6338761: Pull complete
Digest: sha256:a1a6e8dbcf0294175df5f248503c8792b3770c53535670e44a7724718fc93e87
Status: Downloaded newer image for k8s.gcr.io/kube-controller-manager:v1.20.6
k8s.gcr.io/kube-controller-manager:v1.20.6
 
# docker pull k8s.gcr.io/kube-scheduler:v1.20.6
v1.20.6: Pulling from kube-scheduler
d94d38b8f0e6: Already exists
6ee16ead6dee: Already exists
1db6741b5f3c: Pull complete
Digest: sha256:ebb0350893fcfe7328140452f8a88ce682ec6f00337015a055d51b3fe0373429
Status: Downloaded newer image for k8s.gcr.io/kube-scheduler:v1.20.6
k8s.gcr.io/kube-scheduler:v1.20.6
 
# docker pull k8s.gcr.io/kube-proxy:v1.20.6
v1.20.6: Pulling from kube-proxy
e5a8c1ed6cf1: Pull complete
f275df365c13: Pull complete
6a2802bb94f4: Pull complete
cb3853c52da4: Pull complete
db342cbe4b1c: Pull complete
9a72dd095a53: Pull complete
a6a3a90a2713: Pull complete
Digest: sha256:7c1710c965f55bca8d06ebd8d5774ecd9ef924f33fb024e424c2b9b565f477dc
Status: Downloaded newer image for k8s.gcr.io/kube-proxy:v1.20.6
k8s.gcr.io/kube-proxy:v1.20.6
 
# docker pull k8s.gcr.io/pause:3.4.1
3.4.1: Pulling from pause
fac425775c9d: Pull complete
Digest: sha256:6c3835cab3980f11b83277305d0d736051c32b17606f5ec59f1dda67c9ba3810
Status: Downloaded newer image for k8s.gcr.io/pause:3.4.1
k8s.gcr.io/pause:3.4.1
 
# docker image ls
REPOSITORY                                                 TAG           IMAGE ID       CREATED         SIZE
k8s.gcr.io/kube-proxy                                      v1.20.6       9a1ebfd8124d   12 days ago     118MB
k8s.gcr.io/kube-scheduler                                  v1.20.6       b93ab2ec4475   12 days ago     47.2MB
k8s.gcr.io/kube-controller-manager                         v1.20.6       560dd11d4550   12 days ago     116MB
k8s.gcr.io/kube-apiserver                                  v1.20.6       b05d611c1af9   12 days ago     122MB
k8s.gcr.io/pause                                           3.4.1         0f8457a4c2ec   3 months ago    683kB

Next up is to tag all the images so they’ll be hosted locally on the bldr0cuomrepo1.internal.pri server.

# docker tag k8s.gcr.io/kube-apiserver:v1.20.6          bldr0cuomrepo1.internal.pri:5000/kube-apiserver:v1.20.6
# docker tag k8s.gcr.io/kube-controller-manager:v1.20.6 bldr0cuomrepo1.internal.pri:5000/kube-controller-manager:v1.20.6
# docker tag k8s.gcr.io/kube-scheduler:v1.20.6          bldr0cuomrepo1.internal.pri:5000/kube-scheduler:v1.20.6
# docker tag k8s.gcr.io/kube-proxy:v1.20.6              bldr0cuomrepo1.internal.pri:5000/kube-proxy:v1.20.6
# docker tag k8s.gcr.io/pause:3.4.1                     bldr0cuomrepo1.internal.pri:5000/pause:3.4.1
 
# docker image ls
REPOSITORY                                                 TAG           IMAGE ID       CREATED         SIZE
bldr0cuomrepo1.internal.pri:5000/kube-proxy                v1.20.6       9a1ebfd8124d   12 days ago     118MB
k8s.gcr.io/kube-proxy                                      v1.20.6       9a1ebfd8124d   12 days ago     118MB
bldr0cuomrepo1.internal.pri:5000/kube-controller-manager   v1.20.6       560dd11d4550   12 days ago     116MB
k8s.gcr.io/kube-controller-manager                         v1.20.6       560dd11d4550   12 days ago     116MB
k8s.gcr.io/kube-scheduler                                  v1.20.6       b93ab2ec4475   12 days ago     47.2MB
bldr0cuomrepo1.internal.pri:5000/kube-scheduler            v1.20.6       b93ab2ec4475   12 days ago     47.2MB
k8s.gcr.io/kube-apiserver                                  v1.20.6       b05d611c1af9   12 days ago     122MB
bldr0cuomrepo1.internal.pri:5000/kube-apiserver            v1.20.6       b05d611c1af9   12 days ago     122MB
bldr0cuomrepo1.internal.pri:5000/pause                     3.4.1         0f8457a4c2ec   3 months ago    683kB
k8s.gcr.io/pause                                           3.4.1         0f8457a4c2ec   3 months ago    683kB

The final step is to push them all up to the local repository.

# docker push bldr0cuomrepo1.internal.pri:5000/kube-apiserver:v1.20.6
The push refers to repository [bldr0cuomrepo1.internal.pri:5000/kube-apiserver]
d88bc16e0414: Pushed
a06ec64d2560: Pushed
28699c71935f: Pushed
v1.20.6: digest: sha256:d21627934fb7546255475a7ab4472ebc1ae7952cc7ee31509ee630376c3eea03 size: 949
 
# docker push bldr0cuomrepo1.internal.pri:5000/kube-controller-manager:v1.20.6
The push refers to repository [bldr0cuomrepo1.internal.pri:5000/kube-controller-manager]
1387661b583c: Pushed
a06ec64d2560: Mounted from kube-apiserver
28699c71935f: Mounted from kube-apiserver
v1.20.6: digest: sha256:ca13f2bf278e3157d75fd08a369390b98f976c6af502d4579a9ab62b97248b5b size: 949
 
# docker push bldr0cuomrepo1.internal.pri:5000/kube-scheduler:v1.20.6
The push refers to repository [bldr0cuomrepo1.internal.pri:5000/kube-scheduler]
f17938017a0a: Pushed
a06ec64d2560: Mounted from kube-controller-manager
28699c71935f: Mounted from kube-controller-manager
v1.20.6: digest: sha256:eee174e9eb4499f31bfb10d0350de87ea90431f949716cc4af1b5c899aab2058 size: 949
 
# docker push bldr0cuomrepo1.internal.pri:5000/kube-proxy:v1.20.6
The push refers to repository [bldr0cuomrepo1.internal.pri:5000/kube-proxy]
0c96004b5be1: Pushed
94812b0f02ce: Pushed
3a90582021f9: Pushed
f6be8a0f65af: Pushed
2b046f2c8708: Pushed
6ee930b14c6f: Pushed
f00bc8568f7b: Pushed
v1.20.6: digest: sha256:1689b5ac14d4d6e202a6752573818ce952e0bd3359b6210707b8b2031fedaa4d size: 1786
 
# docker push bldr0cuomrepo1.internal.pri:5000/pause:3.4.1
The push refers to repository [bldr0cuomrepo1.internal.pri:5000/pause]
915e8870f7d1: Pushed
3.4.1: digest: sha256:9ec1e780f5c0196af7b28f135ffc0533eddcb0a54a0ba8b32943303ce76fe70d size: 526

Software Preparations

This section describes the updates that need to be made to the various containers that are installed in the Kubernetes clusters. Most of the changes involve updating the location to point to my Docker Repository vs pulling directly from the Internet.

You'll need to clone (if new) or pull the current playbook repo from gitlab, as all the work will be done in various directories under the kubernetes/configurations directory. You'll want to do that before continuing. All subsequent sections assume you're in the kubernetes/configurations directory.

$ git clone git@lnmt1cuomgitlab.internal.pri:external-unix/playbooks.git
$ git pull git@lnmt1cuomgitlab.internal.pri:external-unix/playbooks.git

Make sure you add and commit the changes to your repo.

$ git add [file]
$ git commit [file] -m "commit comment"

And once done with all the updates, push the changes back up to gitlab.

$ git push

Update calico.yaml

In the calico directory run the following command to get the current calico.yaml file:

$ curl https://docs.projectcalico.org/manifests/calico.yaml -O

Grep out the image lines (as shown below) and pull the new images down so they can be hosted in the local repository.
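
Grepping the manifest is enough to see which images it references; the (roughly trimmed) output for 3.18.2 matches the four pulls below:

$ grep "image:" calico.yaml | sort -u
          image: docker.io/calico/cni:v3.18.2
          image: docker.io/calico/kube-controllers:v3.18.2
          image: docker.io/calico/node:v3.18.2
          image: docker.io/calico/pod2daemon-flexvol:v3.18.2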

# docker pull docker.io/calico/cni:v3.18.2
v3.18.2: Pulling from calico/cni
69606a78e084: Pull complete
85f85638f4b8: Pull complete
70ce15fa0c8a: Pull complete
Digest: sha256:664e1667fae09516a170ddd86e1a9c3bd021442f1e1c1fad19ce33d5b68bb58e
Status: Downloaded newer image for calico/cni:v3.18.2
docker.io/calico/cni:v3.18.2
 
# docker pull docker.io/calico/pod2daemon-flexvol:v3.18.2
v3.18.2: Pulling from calico/pod2daemon-flexvol
a5a0edbd6170: Pull complete
b10b71798d0d: Pull complete
5c3c4f282980: Pull complete
052e1842c6c3: Pull complete
6f392ce4dbcf: Pull complete
bc1f9a256ba0: Pull complete
fa4be31a19e9: Pull complete
Digest: sha256:7808a18ac025d3b154a9ddb7ca6439565d0af52a37e166cb1a14dcdb20caed67
Status: Downloaded newer image for calico/pod2daemon-flexvol:v3.18.2
docker.io/calico/pod2daemon-flexvol:v3.18.2
 
# docker pull docker.io/calico/node:v3.18.2
v3.18.2: Pulling from calico/node
2aee75817f4e: Pull complete
e1c64009c125: Pull complete
Digest: sha256:c598c6d5f43080f4696af03dd8784ad861b40c718ffbba5536b14dbf3b2349af
Status: Downloaded newer image for calico/node:v3.18.2
docker.io/calico/node:v3.18.2
 
# docker pull docker.io/calico/kube-controllers:v3.18.2
v3.18.2: Pulling from calico/kube-controllers
94ca07728981: Pull complete
c86a87d48320: Pull complete
f257a15e509c: Pull complete
8aad47abc588: Pull complete
Digest: sha256:ae544f188f2bd9d2fcd4b1f2b9a031c903ccaff8430737d6555833a81f4824d1
Status: Downloaded newer image for calico/kube-controllers:v3.18.2
docker.io/calico/kube-controllers:v3.18.2

Then tag the images for local storage.

# docker tag calico/cni:v3.18.2                bldr0cuomrepo1.internal.pri:5000/cni:v3.18.2
# docker tag calico/pod2daemon-flexvol:v3.18.2 bldr0cuomrepo1.internal.pri:5000/pod2daemon-flexvol:v3.18.2
# docker tag calico/node:v3.18.2               bldr0cuomrepo1.internal.pri:5000/node:v3.18.2
# docker tag calico/kube-controllers:v3.18.2   bldr0cuomrepo1.internal.pri:5000/kube-controllers:v3.18.2

Then push them up to the local repository.

# docker push bldr0cuomrepo1.internal.pri:5000/cni:v3.18.2
The push refers to repository [bldr0cuomrepo1.internal.pri:5000/cni]
145c410196dc: Pushed
aec93328a278: Pushed
fd6f5b9d2ec9: Pushed
v3.18.2: digest: sha256:42ffea5056c9b61783423e16390869cdc16a8797eb9231cf7c747fe70371dfef size: 946
 
# docker push bldr0cuomrepo1.internal.pri:5000/pod2daemon-flexvol:v3.18.2
The push refers to repository [bldr0cuomrepo1.internal.pri:5000/pod2daemon-flexvol]
125832445a60: Pushed
682e2fee7907: Pushed
12f496e83a60: Pushed
45acaaeabd00: Pushed
427dd33e9f20: Pushed
76ecd8aaf249: Pushed
63c82d5fed4a: Pushed
v3.18.2: digest: sha256:f243b72138e8e1d0e6399d000c03f38a052f54234f3d3b8a292f3c868a51ab07 size: 1788
 
# docker push bldr0cuomrepo1.internal.pri:5000/node:v3.18.2
The push refers to repository [bldr0cuomrepo1.internal.pri:5000/node]
7c3bf8ac29b3: Pushed
534f69678b53: Pushed
v3.18.2: digest: sha256:d51436d6da50afc73d9de086aa03f7abd6938ecf2a838666a0e5ccb8dee25087 size: 737
 
# docker push bldr0cuomrepo1.internal.pri:5000/kube-controllers:v3.18.2
The push refers to repository [bldr0cuomrepo1.internal.pri:5000/kube-controllers]
5d1855397d0b: Pushed
4769d3354700: Pushed
4ea3707886e0: Pushed
054ba5c2f771: Pushed
v3.18.2: digest: sha256:d8d2c4a98bbdbfd19fe2e4cc9492552852a9d11628e338142b1d1268b51593ce size: 1155

Edit the file, search for image:, and update each image reference so it pulls from the local repository path:

bldr0cuomrepo1.internal.pri:5000
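
If you'd rather script the edit than do it by hand, a one-liner along these lines matches the tagging scheme above, assuming the manifest references the images as docker.io/calico/... (it writes a .bak backup first):

$ sed -i.bak 's|docker.io/calico/|bldr0cuomrepo1.internal.pri:5000/|g' calico.yaml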

Make sure you follow the documentation to update calicoctl to 3.18.2 as well.

Update metrics-server

In the metrics-server directory, back up the existing components.yaml file and run the following command to get the current components.yaml file:

$ wget https://github.com/kubernetes-sigs/metrics-server/releases/download/v0.4.3/components.yaml

Run a diff against the two files to see what might have changed. Then edit the file, search for image: and replace k8s.gcr.io with bldr0cuomrepo1.internal.pri:5000.

Download the new image and save it locally.

# docker pull k8s.gcr.io/metrics-server/metrics-server:v0.4.3
v0.4.3: Pulling from metrics-server/metrics-server
5dea5ec2316d: Pull complete
ef7ee42a1880: Pull complete
Digest: sha256:eb6b6153494087bde59ceb14e68280f1fbdd17cfff2efc3a68e30a1adfa8807d
Status: Downloaded newer image for k8s.gcr.io/metrics-server/metrics-server:v0.4.3
k8s.gcr.io/metrics-server/metrics-server:v0.4.3

Tag the image.

# docker tag k8s.gcr.io/metrics-server/metrics-server:v0.4.3 bldr0cuomrepo1.internal.pri:5000/metrics-server:v0.4.3

And push the newly tagged image.

# docker push bldr0cuomrepo1.internal.pri:5000/metrics-server:v0.4.3
The push refers to repository [bldr0cuomrepo1.internal.pri:5000/metrics-server]
abc161b95845: Pushed
417cb9b79ade: Pushed
v0.4.3: digest: sha256:2b6814cb0b058b753cb6cdfe906493a8128fabb03d405f60024a47ab49ddaa09 size: 739

Update kube-state-metrics

Updating kube-state-metrics is a bit more involved, as there are several files in the distribution but you only need a small subset. You'll need to clone or pull the kube-state-metrics repo.

$ git clone https://github.com/kubernetes/kube-state-metrics.git

Once you have the repo, in the kube-state-metrics/examples/standard directory, copy all the files into the playbooks kube-state-metrics directory.

Edit the deployment.yaml file, search for image: and replace quay.io with bldr0cuomrepo1.internal.pri:5000

After you’ve updated the files, download the image:

# docker pull k8s.gcr.io/kube-state-metrics/kube-state-metrics:v2.0.0
v2.0.0: Pulling from kube-state-metrics/kube-state-metrics
5dea5ec2316d: Already exists
2c0aab77c223: Pull complete
Digest: sha256:eb2f41024a583e8795213726099c6f9432f2d64ab3754cc8ab8d00bdbc328910
Status: Downloaded newer image for k8s.gcr.io/kube-state-metrics/kube-state-metrics:v2.0.0
k8s.gcr.io/kube-state-metrics/kube-state-metrics:v2.0.0

Tag the image.

# docker tag k8s.gcr.io/kube-state-metrics/kube-state-metrics:v2.0.0 bldr0cuomrepo1.internal.pri:5000/kube-state-metrics:v2.0.0

And push the newly tagged image.

# docker push bldr0cuomrepo1.internal.pri:5000/kube-state-metrics:v2.0.0
The push refers to repository [bldr0cuomrepo1.internal.pri:5000/kube-state-metrics]
d2bc11882435: Pushed
417cb9b79ade: Mounted from metrics-server
v2.0.0: digest: sha256:ee13833414a49b0d2370e8edff5844eba96630cda80cfcd37c444bf88522cc51 size: 738

Update filebeat-kubernetes.yaml

In the filebeat directory, run the following command to get the current filebeat-kubernetes.yaml file:

curl -L -O https://raw.githubusercontent.com/elastic/beats/7.12/deploy/kubernetes/filebeat-kubernetes.yaml

Change all references in the filebeat-kubernetes.yaml file from kube-system to monitoring. If this is a new installation, create the monitoring namespace as shown below.
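
Creating the namespace is a one-time step per cluster:

$ kubectl create namespace monitoring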

Update the local repository with the new docker image.

# docker pull docker.elastic.co/beats/filebeat:7.12.1
7.12.1: Pulling from beats/filebeat
a4f595742a5b: Pull complete
f7bc9401458a: Pull complete
ce7f9b59a9d3: Pull complete
e0ba09632c1a: Pull complete
3a0a0a9a5b5f: Pull complete
4f7abff72235: Pull complete
8cf479d85574: Pull complete
3b62c2ebd4b6: Pull complete
79a6ebf558dc: Pull complete
0c22790a6b07: Pull complete
dfd98a660972: Pull complete
Digest: sha256:e9558ca6e2df72a7933d4f175d85e8cf352da08bc32d97943bb844745d4a063a
Status: Downloaded newer image for docker.elastic.co/beats/filebeat:7.12.1
docker.elastic.co/beats/filebeat:7.12.1

Tag the image appropriately.

# docker tag docker.elastic.co/beats/filebeat:7.12.1 bldr0cuomrepo1.internal.pri:5000/filebeat:7.12.1

Finally, push it up to the local repository.

# docker push bldr0cuomrepo1.internal.pri:5000/filebeat:7.12.1
The push refers to repository [bldr0cuomrepo1.internal.pri:5000/filebeat]
446d15d628e2: Pushed
19bc11b9258e: Pushed
8ee55e79c98f: Pushed
851de8b3f92f: Pushed
eacdcb47588f: Pushed
bc27d098296e: Pushed
9c4f2da5ee8b: Pushed
2c278752a013: Pushed
bd82c7b8fd60: Pushed
f9b1f5eda8ab: Pushed
174f56854903: Pushed
7.12.1: digest: sha256:02a034166c71785f5c2d1787cc607994f68aa0521734d11da91f8fbd0cfdc640 size: 2616

Once the image is hosted locally, copy the file into each of the cluster directories and make the following changes.

DaemonSet Changes

In the filebeat folder are two files: a config file and an update file. These automatically make changes to the filebeat-kubernetes.yaml file based on some of the changes performed below. The changes below prepare the file for the script, which populates the different clusters with the correct information.

  • Replaces the docker.elastic.co/beats image registry with bldr0cuomrepo1.internal.pri:5000
  • Replaces <elasticsearch> with the actual ELK master server name
  • Replaces the kube-system namespace with monitoring. You'll need to ensure the monitoring namespace has been created before applying this .yaml file.
  • Replaces DEPLOY_ENV with the expected deployment environment name: dev, sqa, staging, or prod. These names are used in the ELK cluster to easily identify where the logs are sourced.

In order for the script to work, change the values in the following lines to match:

        - name: ELASTICSEARCH_HOST
          value: "<elasticsearch>"
        - name: ELASTICSEARCH_PORT
          value: "9200"
        - name: ELASTICSEARCH_USERNAME
          value: ""
        - name: ELASTICSEARCH_PASSWORD
          value: ""

In addition, remove the following lines. They confuse the container if they exist.

        - name: ELASTIC_CLOUD_ID
          value:
        - name: ELASTIC_CLOUD_AUTH
          value:

Add the default username and password to the following lines as noted:

    output.elasticsearch:
      hosts: ['${ELASTICSEARCH_HOST:elasticsearch}:${ELASTICSEARCH_PORT:9200}']
      username: ${ELASTICSEARCH_USERNAME:elastic}
      password: ${ELASTICSEARCH_PASSWORD:changeme}

ConfigMap Changes

In the ConfigMap section, activate the filebeat.autodiscover section by uncommenting it and delete the filebeat.inputs configuration section. In the filebeat.autodiscover section make the following three changes:

filebeat.autodiscover:
  providers:
    - type: kubernetes
      host: ${NODE_NAME}                          # rename node to host
      hints.enabled: true
      hints.default_config.enabled: false         # add this line
      hints.default_config:
        type: container
        paths:
          - /var/log/containers/*${data.kubernetes.container.id}.log
        exclude_lines: ["^\\s+[\\-`('.|_]"]  # drop asciiart lines  # add this line

In the processors section, remove the cloud.id and cloud.auth lines, add the following lines, and change DEPLOY_ENV to the environment filebeat is being deployed to: dev, sqa, staging, or prod. Indentation is important!

processors:
- add_cloud_metadata:
- add_host_metadata:
- add_fields:                             # add these 4 lines. pay attention to indentation!
    target: ''
    fields:
      environment: 'DEPLOY_ENV'

Elastic Stack in Development

This Elastic Stack cluster is used by the Development Kubernetes clusters. Update the files in the bldr0-0 directory.

- name: ELASTICSEARCH_HOST
  value: bldr0cuomifem1.internal.pri

Elastic Stack in QA

This Elastic Stack cluster is used by the QA Kubernetes clusters. Update the files in the cabo0-0 directory.

- name: ELASTICSEARCH_HOST
  value: cabo0cuomifem1.internal.pri

Elastic Stack in Staging

This Elastic Stack cluster is used by the Staging Kubernetes clusters. Update the files in the tato0-1 directory.

- name: ELASTICSEARCH_HOST
  value: tato0cuomifem1.internal.pri

Elastic Stack in Production

This Elastic Stack cluster is used by the Production Kubernetes cluster. Update the file in the lnmt1-2 directory.

- name: ELASTICSEARCH_HOST
  value: lnmt1cuelkmstr1.internal.pri

Kubernetes Manual Upgrade to 1.20.6

Upgrading Kubernetes Clusters

This documentation is intended to provide the manual process for upgrading the server Operating Systems, Kubernetes to 1.20.6, and any additional updates. This provides example output and should help in troubleshooting should the automated processes experience a problem.

All of the steps required to prepare for an installation should be completed prior to starting this process.

Server and Kubernetes Upgrades

Patch Servers

As part of quarterly upgrades, the Operating Systems for all servers need to be upgraded.

For the control plane, there isn’t a “pool” so just patch each server and reboot it. Do one server at a time and check the status of the cluster before moving to subsequent master servers on the control plane.

For the worker nodes, you'll need to drain each of the workers before patching and rebooting. Run the following command to confirm both that the current version is 1.19.6 and that all nodes are in a Ready state to be patched:

$ kubectl get nodes
NAME                           STATUS   ROLES    AGE    VERSION
bldr0cuomknode1.internal.pri   Ready    <none>   214d   v1.19.6
bldr0cuomknode2.internal.pri   Ready    <none>   214d   v1.19.6
bldr0cuomknode3.internal.pri   Ready    <none>   214d   v1.19.6
bldr0cuomkube1.internal.pri    Ready    master   214d   v1.19.6
bldr0cuomkube2.internal.pri    Ready    master   214d   v1.19.6
bldr0cuomkube3.internal.pri    Ready    master   214d   v1.19.6

To drain a server, patch, and then return the server to the pool, follow the steps below.

kubectl drain [nodename] --delete-local-data --ignore-daemonsets

Then patch the server and reboot:

yum upgrade -y
shutdown -t 0 now -r

Finally bring the node back into the pool.

kubectl uncordon [nodename]

Update Versionlock Information

Currently the clusters have locked kubernetes to version 1.19.6, kubernetes-cni to version 0.8.7, and docker to 1.13.1-203. The locks on each server need to be removed and new locks put in place for the new versions of kubernetes, kubernetes-cni, and docker where appropriate.

Versionlock file location: /etc/yum/pluginconf.d/

Simply delete the existing locks:

/usr/bin/yum versionlock delete "kubelet.*"
/usr/bin/yum versionlock delete "kubectl.*"
/usr/bin/yum versionlock delete "kubeadm.*"
/usr/bin/yum versionlock delete "kubernetes-cni.*"
/usr/bin/yum versionlock delete "docker.*"
/usr/bin/yum versionlock delete "docker-common.*"
/usr/bin/yum versionlock delete "docker-client.*"
/usr/bin/yum versionlock delete "docker-rhel-push-plugin.*"

And then add in the new locks at the desired levels:

/usr/bin/yum versionlock add "kubelet-1.20.6-0.*"
/usr/bin/yum versionlock add "kubectl-1.20.6-0.*"
/usr/bin/yum versionlock add "kubeadm-1.20.6-0.*"
/usr/bin/yum versionlock "docker-1.13.1-204.*"
/usr/bin/yum versionlock "docker-common-1.13.1-204.*"
/usr/bin/yum versionlock "docker-client-1.13.1-204.*"
/usr/bin/yum versionlock "docker-rhel-push-plugin-1.13.1-204.*"

Then install the updated kubernetes and docker binaries. Note that the versionlocked versions and the installed version must match:

/usr/bin/yum install kubelet-1.20.6-0.x86_64
/usr/bin/yum install kubectl-1.20.6-0.x86_64
/usr/bin/yum install kubeadm-1.20.6-0.x86_64
/usr/bin/yum install docker-1.13.1-204.git0be3e21.el7_8.x86_64
/usr/bin/yum install docker-common-1.13.1-204.git0be3e21.el7*
/usr/bin/yum install docker-client-1.13.1-204.git0be3e21.el7*
/usr/bin/yum install docker-rhel-push-plugin-1.13.1-204.git0be3e21.el7*
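
A quick way to confirm the locks took effect and that the installed versions match them:

/usr/bin/yum versionlock list
rpm -q kubelet kubectl kubeadm docker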

Upgrade Kubernetes

Using the kubeadm command on the first master server, you can review the plan and then upgrade the cluster:

# kubeadm upgrade plan
[upgrade/config] Making sure the configuration is correct:
[upgrade/config] Reading configuration from the cluster...
[upgrade/config] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[preflight] Running pre-flight checks.
[upgrade] Running cluster health checks
[upgrade] Fetching available versions to upgrade to
[upgrade/versions] Cluster version: v1.19.6
[upgrade/versions] kubeadm version: v1.20.6
I0427 17:46:38.139615   20479 version.go:254] remote version is much newer: v1.21.0; falling back to: stable-1.20
[upgrade/versions] Latest stable version: v1.20.6
[upgrade/versions] Latest stable version: v1.20.6
[upgrade/versions] Latest version in the v1.19 series: v1.19.10
[upgrade/versions] Latest version in the v1.19 series: v1.19.10
 
Components that must be upgraded manually after you have upgraded the control plane with 'kubeadm upgrade apply':
COMPONENT   CURRENT       AVAILABLE
kubelet     6 x v1.19.6   v1.19.10
 
Upgrade to the latest version in the v1.19 series:
 
COMPONENT                 CURRENT    AVAILABLE
kube-apiserver            v1.19.6    v1.19.10
kube-controller-manager   v1.19.6    v1.19.10
kube-scheduler            v1.19.6    v1.19.10
kube-proxy                v1.19.6    v1.19.10
CoreDNS                   1.7.0      1.7.0
etcd                      3.4.13-0   3.4.13-0
 
You can now apply the upgrade by executing the following command:
 
kubeadm upgrade apply v1.19.10
 
_____________________________________________________________________
 
Components that must be upgraded manually after you have upgraded the control plane with 'kubeadm upgrade apply':
COMPONENT   CURRENT       AVAILABLE
kubelet     6 x v1.19.6   v1.20.6
 
Upgrade to the latest stable version:
 
COMPONENT                 CURRENT    AVAILABLE
kube-apiserver            v1.19.6    v1.20.6
kube-controller-manager   v1.19.6    v1.20.6
kube-scheduler            v1.19.6    v1.20.6
kube-proxy                v1.19.6    v1.20.6
CoreDNS                   1.7.0      1.7.0
etcd                      3.4.13-0   3.4.13-0
 
You can now apply the upgrade by executing the following command:
 
kubeadm upgrade apply v1.20.6
 
_____________________________________________________________________
 
 
The table below shows the current state of component configs as understood by this version of kubeadm.
Configs that have a "yes" mark in the "MANUAL UPGRADE REQUIRED" column require manual config upgrade or
resetting to kubeadm defaults before a successful upgrade can be performed. The version to manually
upgrade to is denoted in the "PREFERRED VERSION" column.
 
API GROUP                 CURRENT VERSION   PREFERRED VERSION   MANUAL UPGRADE REQUIRED
kubeproxy.config.k8s.io   v1alpha1          v1alpha1            no
kubelet.config.k8s.io     v1beta1           v1beta1             no
_____________________________________________________________________

There are likely newer versions of the Kubernetes control plane containers available. In order to maintain consistency across all clusters, only upgrade the masters to 1.20.6:

# kubeadm upgrade apply v1.20.6
[upgrade/config] Making sure the configuration is correct:
[upgrade/config] Reading configuration from the cluster...
[upgrade/config] FYI: You can look at this config file with 'kubectl -n kube-system get cm kubeadm-config -o yaml'
[preflight] Running pre-flight checks.
[upgrade] Running cluster health checks
[upgrade/version] You have chosen to change the cluster version to "v1.20.6"
[upgrade/versions] Cluster version: v1.19.6
[upgrade/versions] kubeadm version: v1.20.6
[upgrade/confirm] Are you sure you want to proceed with the upgrade? [y/N]: y
[upgrade/prepull] Pulling images required for setting up a Kubernetes cluster
[upgrade/prepull] This might take a minute or two, depending on the speed of your internet connection
[upgrade/prepull] You can also perform this action in beforehand using 'kubeadm config images pull'
[upgrade/apply] Upgrading your Static Pod-hosted control plane to version "v1.20.6"...
Static pod: kube-apiserver-bldr0cuomkube1.internal.pri hash: 2742aa8dcdc3cb47ed265f67f1a04783
Static pod: kube-controller-manager-bldr0cuomkube1.internal.pri hash: dd7adc86b875b67ba03820b12d904fa9
Static pod: kube-scheduler-bldr0cuomkube1.internal.pri hash: 6a43bc71ab534486758c1d56bd907ea3
[upgrade/etcd] Upgrading to TLS for etcd
Static pod: etcd-bldr0cuomkube1.internal.pri hash: 7e320baf6cd06f441f462de7da1d6f05
[upgrade/staticpods] Preparing for "etcd" upgrade
[upgrade/staticpods] Renewing etcd-server certificate
[upgrade/staticpods] Renewing etcd-peer certificate
[upgrade/staticpods] Renewing etcd-healthcheck-client certificate
[upgrade/staticpods] Moved new manifest to "/etc/kubernetes/manifests/etcd.yaml" and backed up old manifest to "/etc/kubernetes/tmp/kubeadm-backup-manifests-2021-04-27-23-31-35/etcd.yaml"
[upgrade/staticpods] Waiting for the kubelet to restart the component
[upgrade/staticpods] This might take a minute or longer depending on the component/version gap (timeout 5m0s)
Static pod: etcd-bldr0cuomkube1.internal.pri hash: 7e320baf6cd06f441f462de7da1d6f05
Static pod: etcd-bldr0cuomkube1.internal.pri hash: 7e320baf6cd06f441f462de7da1d6f05
Static pod: etcd-bldr0cuomkube1.internal.pri hash: 7e320baf6cd06f441f462de7da1d6f05
Static pod: etcd-bldr0cuomkube1.internal.pri hash: 7e320baf6cd06f441f462de7da1d6f05
...
[apiclient] Found 3 Pods for label selector component=etcd
[upgrade/staticpods] Component "etcd" upgraded successfully!
[upgrade/etcd] Waiting for etcd to become available
[upgrade/staticpods] Writing new Static Pod manifests to "/etc/kubernetes/tmp/kubeadm-upgraded-manifests040252515"
[upgrade/staticpods] Preparing for "kube-apiserver" upgrade
[upgrade/staticpods] Renewing apiserver certificate
[upgrade/staticpods] Renewing apiserver-kubelet-client certificate
[upgrade/staticpods] Renewing front-proxy-client certificate
[upgrade/staticpods] Renewing apiserver-etcd-client certificate
[upgrade/staticpods] Moved new manifest to "/etc/kubernetes/manifests/kube-apiserver.yaml" and backed up old manifest to "/etc/kubernetes/tmp/kubeadm-backup-manifests-2021-04-27-23-31-35/kube-apiserver.yaml"
[upgrade/staticpods] Waiting for the kubelet to restart the component
[upgrade/staticpods] This might take a minute or longer depending on the component/version gap (timeout 5m0s)
Static pod: kube-apiserver-bldr0cuomkube1.internal.pri hash: 2742aa8dcdc3cb47ed265f67f1a04783
Static pod: kube-apiserver-bldr0cuomkube1.internal.pri hash: 7426ddce1aafd033ae049eefb6d56b1e
[apiclient] Found 3 Pods for label selector component=kube-apiserver
[upgrade/staticpods] Component "kube-apiserver" upgraded successfully!
[upgrade/staticpods] Preparing for "kube-controller-manager" upgrade
[upgrade/staticpods] Renewing controller-manager.conf certificate
[upgrade/staticpods] Moved new manifest to "/etc/kubernetes/manifests/kube-controller-manager.yaml" and backed up old manifest to "/etc/kubernetes/tmp/kubeadm-backup-manifests-2021-04-27-23-31-35/kube-controller-manager.yaml"
[upgrade/staticpods] Waiting for the kubelet to restart the component
[upgrade/staticpods] This might take a minute or longer depending on the component/version gap (timeout 5m0s)
Static pod: kube-controller-manager-bldr0cuomkube1.internal.pri hash: dd7adc86b875b67ba03820b12d904fa9
Static pod: kube-controller-manager-bldr0cuomkube1.internal.pri hash: 281525a644d92747499c625139b84436
[apiclient] Found 3 Pods for label selector component=kube-controller-manager
[upgrade/staticpods] Component "kube-controller-manager" upgraded successfully!
[upgrade/staticpods] Preparing for "kube-scheduler" upgrade
[upgrade/staticpods] Renewing scheduler.conf certificate
[upgrade/staticpods] Moved new manifest to "/etc/kubernetes/manifests/kube-scheduler.yaml" and backed up old manifest to "/etc/kubernetes/tmp/kubeadm-backup-manifests-2021-04-27-23-31-35/kube-scheduler.yaml"
[upgrade/staticpods] Waiting for the kubelet to restart the component
[upgrade/staticpods] This might take a minute or longer depending on the component/version gap (timeout 5m0s)
Static pod: kube-scheduler-bldr0cuomkube1.internal.pri hash: 6a43bc71ab534486758c1d56bd907ea3
Static pod: kube-scheduler-bldr0cuomkube1.internal.pri hash: aa70347866b81f5866423fcccb0c6aca
[apiclient] Found 3 Pods for label selector component=kube-scheduler
[upgrade/staticpods] Component "kube-scheduler" upgraded successfully!
[upgrade/postupgrade] Applying label node-role.kubernetes.io/control-plane='' to Nodes with label node-role.kubernetes.io/master='' (deprecated)
[upload-config] Storing the configuration used in ConfigMap "kubeadm-config" in the "kube-system" Namespace
[kubelet] Creating a ConfigMap "kubelet-config-1.20" in namespace kube-system with the configuration for the kubelets in the cluster
[kubelet-start] Writing kubelet configuration to file "/var/lib/kubelet/config.yaml"
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to get nodes
[bootstrap-token] configured RBAC rules to allow Node Bootstrap tokens to post CSRs in order for nodes to get long term certificate credentials
[bootstrap-token] configured RBAC rules to allow the csrapprover controller automatically approve CSRs from a Node Bootstrap Token
[bootstrap-token] configured RBAC rules to allow certificate rotation for all node client certificates in the cluster
[addons] Applied essential addon: CoreDNS
[addons] Applied essential addon: kube-proxy
 
[upgrade/successful] SUCCESS! Your cluster was upgraded to "v1.20.6". Enjoy!
 
[upgrade/kubelet] Now that your control plane is upgraded, please proceed with upgrading your kubelets if you haven't already done so.

Update Control Planes

On the second and third master, run the kubeadm upgrade apply v1.20.6 command and the control plane will be upgraded.

Update File and Directory Permissions

Verify the permissions match the table below once the upgrade is complete (a quick stat check follows the table):

/etc/kubernetes/manifests/etcd.yaml                       root:root  0644
/etc/kubernetes/manifests/kube-apiserver.yaml             root:root  0644
/etc/kubernetes/manifests/kube-controller-manager.yaml    root:root  0644
/etc/kubernetes/manifests/kube-scheduler.yaml             root:root  0644
/var/lib/etcd                                             root:root  0700
/etc/kubernetes/admin.conf                                root:root  0644
/etc/kubernetes/scheduler.conf                            root:root  0644
/etc/kubernetes/controller-manager.conf                   root:root  0644
/etc/kubernetes/pki                                       root:root  0755
/etc/kubernetes/pki/ca.crt                                root:root  0644
/etc/kubernetes/pki/apiserver.crt                         root:root  0644
/etc/kubernetes/pki/apiserver-kubelet-client.crt          root:root  0644
/etc/kubernetes/pki/front-proxy-ca.crt                    root:root  0644
/etc/kubernetes/pki/front-proxy-client.crt                root:root  0644
/etc/kubernetes/pki/sa.pub                                root:root  0644
/etc/kubernetes/pki/ca.key                                root:root  0600
/etc/kubernetes/pki/apiserver.key                         root:root  0600
/etc/kubernetes/pki/apiserver-kubelet-client.key          root:root  0600
/etc/kubernetes/pki/front-proxy-ca.key                    root:root  0600
/etc/kubernetes/pki/front-proxy-client.key                root:root  0600
/etc/kubernetes/pki/sa.key                                root:root  0600
/etc/kubernetes/pki/etcd                                  root:root  0755
/etc/kubernetes/pki/etcd/ca.crt                           root:root  0644
/etc/kubernetes/pki/etcd/server.crt                       root:root  0644
/etc/kubernetes/pki/etcd/peer.crt                         root:root  0644
/etc/kubernetes/pki/etcd/healthcheck-client.crt           root:root  0644
/etc/kubernetes/pki/etcd/ca.key                           root:root  0600
/etc/kubernetes/pki/etcd/server.key                       root:root  0600
/etc/kubernetes/pki/etcd/peer.key                         root:root  0600
/etc/kubernetes/pki/etcd/healthcheck-client.key           root:root  0600
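
A quick spot check against the table; stat prints the owner, group, and octal mode for each path (adjust the list as needed):

stat -c '%U:%G %a %n' /etc/kubernetes/manifests/*.yaml /etc/kubernetes/*.conf /var/lib/etcd
stat -c '%U:%G %a %n' /etc/kubernetes/pki /etc/kubernetes/pki/* /etc/kubernetes/pki/etcd/*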

Update Manifests

During the kubeadm upgrade, the current control plane manifests are moved from /etc/kubernetes/manifests into /etc/kubernetes/tmp and new manifest files deployed. There are multiple settings and permissions that need to be reviewed and updated before the task is considered completed.

The kubeadm-config configmap has been updated to point to bldr0cuomrepo1.internal.pri:5000; however, it and the various container configurations should be checked anyway. One issue is that if it's not updated or used, you'll have to make the updates manually, including editing the kube-proxy daemonset configuration by hand.

Note that when a manifest is updated, the associated image is reloaded. No need to manage the pods once manifests are updated.

etcd Manifest

Verify and update etcd.yaml

  • Change imagePullPolicy to Always.
  • Change image switching k8s.gcr.io with bldr0cuomrepo1.internal.pri:5000

kube-apiserver Manifest

Verify and update kube-apiserver.yaml

  • Add the AlwaysPullImages and ResourceQuota admission controllers to the --enable-admission-plugins line
  • Change imagePullPolicy to Always.
  • Change image switching k8s.gcr.io with bldr0cuomrepo1.internal.pri:5000

kube-controller-manager Manifest

Verify and update kube-controller-manager.yaml

  • Add "--cluster-name=kubecluster-[site]" after "--cluster-cidr=192.168.0.0/16"
  • Change imagePullPolicy to Always
  • Change image switching k8s.gcr.io with bldr0cuomrepo1.internal.pri:5000

kube-scheduler Manifest

Verify and update kube-scheduler.yaml

  • Change imagePullPolicy to Always.
  • Change image switching k8s.gcr.io with bldr0cuomrepo1.internal.pri:5000

Update kube-proxy

Verify where the kube-proxy image is being loaded from. If it's not the local repository, you'll need to edit the kube-proxy daemonset to change the imagePullPolicy. Check the image tag at the same time.

kubectl edit daemonset kube-proxy -n kube-system

  • Change imagePullPolicy to Always
  • Change image switching k8s.gcr.io with bldr0cuomrepo1.internal.pri:5000

Save the changes.

Update coredns

Verify where the coredns image is being loaded from. If it's not the local repository, you'll need to edit the coredns deployment to change the imagePullPolicy. Check the image tag at the same time.

kubectl edit deployment coredns -n kube-system
  • Change imagePullPolicy to Always
  • Change the image line, replacing k8s.gcr.io with bldr0cuomrepo1.internal.pri:5000

Save the changes.

Restart kubelet

Once done, kubelet and docker need to be restarted on all nodes.

systemctl daemon-reload
systemctl restart kubelet
systemctl restart docker

Verify

Once kubelet has been restarted on all nodes, verify all nodes are at 1.20.6.

$ kubectl get nodes
NAME                           STATUS   ROLES                  AGE    VERSION
bldr0cuomknode1.internal.pri   Ready    <none>                 215d   v1.20.6
bldr0cuomknode2.internal.pri   Ready    <none>                 215d   v1.20.6
bldr0cuomknode3.internal.pri   Ready    <none>                 215d   v1.20.6
bldr0cuomkube1.internal.pri    Ready    control-plane,master   215d   v1.20.6
bldr0cuomkube2.internal.pri    Ready    control-plane,master   215d   v1.20.6
bldr0cuomkube3.internal.pri    Ready    control-plane,master   215d   v1.20.6

Configuration Upgrades

Configuration files are on the tool server (lnmt1cuomtool11) in the /usr/local/admin/playbooks/cschelin/kubernetes/configurations directory, and the expectation is that you’ll be in that directory when directed to apply configurations.

Calico Upgrade

In the calico directory, run the following command:

$ kubectl apply -f calico.yaml
configmap/calico-config unchanged
customresourcedefinition.apiextensions.k8s.io/bgpconfigurations.crd.projectcalico.org configured
customresourcedefinition.apiextensions.k8s.io/bgppeers.crd.projectcalico.org configured
customresourcedefinition.apiextensions.k8s.io/blockaffinities.crd.projectcalico.org configured
customresourcedefinition.apiextensions.k8s.io/clusterinformations.crd.projectcalico.org configured
customresourcedefinition.apiextensions.k8s.io/felixconfigurations.crd.projectcalico.org configured
customresourcedefinition.apiextensions.k8s.io/globalnetworkpolicies.crd.projectcalico.org configured
customresourcedefinition.apiextensions.k8s.io/globalnetworksets.crd.projectcalico.org configured
customresourcedefinition.apiextensions.k8s.io/hostendpoints.crd.projectcalico.org configured
customresourcedefinition.apiextensions.k8s.io/ipamblocks.crd.projectcalico.org configured
customresourcedefinition.apiextensions.k8s.io/ipamconfigs.crd.projectcalico.org configured
customresourcedefinition.apiextensions.k8s.io/ipamhandles.crd.projectcalico.org configured
customresourcedefinition.apiextensions.k8s.io/ippools.crd.projectcalico.org configured
customresourcedefinition.apiextensions.k8s.io/kubecontrollersconfigurations.crd.projectcalico.org configured
customresourcedefinition.apiextensions.k8s.io/networkpolicies.crd.projectcalico.org configured
customresourcedefinition.apiextensions.k8s.io/networksets.crd.projectcalico.org configured
clusterrole.rbac.authorization.k8s.io/calico-kube-controllers configured
clusterrolebinding.rbac.authorization.k8s.io/calico-kube-controllers unchanged
clusterrole.rbac.authorization.k8s.io/calico-node unchanged
clusterrolebinding.rbac.authorization.k8s.io/calico-node unchanged
daemonset.apps/calico-node configured
serviceaccount/calico-node unchanged
deployment.apps/calico-kube-controllers configured
serviceaccount/calico-kube-controllers unchanged
poddisruptionbudget.policy/calico-kube-controllers unchanged

After calico.yaml is applied, the calico-kube-controllers pod will restart and then the calico-node pod restarts to retrieve the updated image.
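
To watch the restarts as they happen, something like the following works (k8s-app=calico-node is the label used in the upstream calico.yaml):

$ kubectl get pods -n kube-system -l k8s-app=calico-node -o wide -w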

Pull the calicoctl binary and copy it to /usr/local/bin, then verify the version. Note that this has likely already been done on the tool server, so check the installed version before pulling the binary again.

$ curl -O -L  https://github.com/projectcalico/calicoctl/releases/download/v3.18.2/calicoctl
% Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
Dload  Upload   Total   Spent    Left  Speed
100   615  100   615    0     0    974      0 --:--:-- --:--:-- --:--:--   974
100 38.1M  100 38.1M    0     0  1505k      0  0:00:25  0:00:25 --:--:-- 1562k
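
The downloaded file isn’t executable, so something like this finishes the install (assuming you have rights to write to /usr/local/bin):

$ chmod +x calicoctl
$ sudo mv calicoctl /usr/local/bin/calicoctl
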
Verification

$ calicoctl version
Client Version:    v3.18.2
Git commit:        528c5860
Cluster Version:   v3.18.2
Cluster Type:      k8s,bgp,kubeadm,kdd

Update CNI File Permissions

Verify the permissions of the files once the upgrade is complete.

/etc/cni/net.d/10-calico-conflist  root:root  644
/etc/cni/net.d/calico-kubeconfig  root:root  644
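
If either file needs fixing, the usual chown/chmod pair does it (paths as listed above):

# chown root:root /etc/cni/net.d/10-calico-conflist /etc/cni/net.d/calico-kubeconfig
# chmod 644 /etc/cni/net.d/10-calico-conflist /etc/cni/net.d/calico-kubeconfig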

metrics-server Upgrade

In the metrics-server directory, run the following command:

$ kubectl apply -f components.yaml
serviceaccount/metrics-server unchanged
clusterrole.rbac.authorization.k8s.io/system:aggregated-metrics-reader unchanged
clusterrole.rbac.authorization.k8s.io/system:metrics-server unchanged
rolebinding.rbac.authorization.k8s.io/metrics-server-auth-reader unchanged
clusterrolebinding.rbac.authorization.k8s.io/metrics-server:system:auth-delegator unchanged
clusterrolebinding.rbac.authorization.k8s.io/system:metrics-server unchanged
service/metrics-server unchanged
deployment.apps/metrics-server configured
apiservice.apiregistration.k8s.io/v1beta1.metrics.k8s.io unchanged

Once the metrics-server deployment is updated, the pod will restart.
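
A quick functional check once the new pod is Running is to request node metrics (standard kubectl; it can take a minute after the restart before data is returned):

$ kubectl top nodes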

kube-state-metrics Upgrade

In this case, we’ll be applying the entire directory, so from the configurations directory, run the following command:

$ kubectl apply -f kube-state-metrics/
clusterrolebinding.rbac.authorization.k8s.io/kube-state-metrics configured
clusterrole.rbac.authorization.k8s.io/kube-state-metrics configured
deployment.apps/kube-state-metrics configured
serviceaccount/kube-state-metrics configured
service/kube-state-metrics configured

Once the kube-state-metrics deployment is updated, the pod will restart.

Filebeat Upgrade

Filebeat sends data to Elastic Stack clusters in four environments, and Filebeat itself is installed on all Kubernetes clusters. Ensure you’re managing the correct cluster when upgrading the Filebeat container, as the configurations are specific to each cluster.

Change to the appropriate cluster context directory and run the following command:

$ kubectl apply -f filebeat-kubernetes.yaml
configmap/filebeat-config configured
daemonset.apps/filebeat configured
clusterrolebinding.rbac.authorization.k8s.io/filebeat unchanged
clusterrole.rbac.authorization.k8s.io/filebeat configured
serviceaccount/filebeat unchanged

Verification

Monitor each cluster; you should see the Filebeat containers restart and return to a Running state.

$ kubectl get pods -n monitoring -o wide

Kubernetes Ansible Upgrade to 1.20.6

Upgrading Kubernetes Clusters

This document provides a guide to upgrading the Kubernetes clusters in the quickest manner. Much of the upgrade process can be done with Ansible playbooks, though a few steps need to be done centrally on the tool server, and the OS and control plane updates are partly manual due to the requirement to remove servers from the Kubernetes API pool by hand.

In most cases, examples are not provided as it is assumed that you are familiar with the processes and can perform and verify the updates without step-by-step reminders.

For any process that is performed with an Ansible Playbook, it is assumed you are on the lnmt1cuomtool11 server in the /usr/local/admin/playbooks/cschelin/kubernetes directory. All Ansible related steps expect to start from that directory. In addition, the application of pod configurations will be in the configurations subdirectory.

Perform Upgrades

Patch Servers

In the 00-osupgrade directory, you’ll be running the master and worker scripts. I recommend opening two windows, one for masters and one for workers, and running each script with master -t [tag] and worker -t [tag]. This will verify a node is Ready, drain the node from the pool if it’s a worker, perform a yum upgrade and reboot, uncordon the node again if it’s a worker, and verify the nodes are Ready again. Should a node fail to become Ready in time, the script will exit.

Update Versionlock

In the 03-packages directory, run the update -t [tag] script. This will install yum-plugin-versionlock if it’s missing, remove the old versionlocks, create new versionlocks for kubernetes, kubernetes-cni, and docker, and then upgrade those components.
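
After the playbook runs, you can confirm the locks took effect (the grep pattern is just an example):

$ yum versionlock list | grep -E 'kube|docker'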

Upgrade Kubernetes

Using the kubeadm command on the first master server, upgrade the first master server.
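
If you want to preview what the upgrade will change before committing, kubeadm’s plan subcommand is handy:

# kubeadm upgrade plan

Then apply the upgrade: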

# kubeadm upgrade apply v1.20.6

Update Control Planes

On the second and third masters, run the same kubeadm upgrade apply v1.20.6 command and the control plane will be upgraded.

Update kube-proxy

Check the kube-proxy daemonset and update the image tag if required.

kubectl edit daemonset kube-proxy -n kube-system
  • Change the image line, replacing k8s.gcr.io with bldr0cuomrepo1.internal.pri:5000

Save the changes.

Update coredns

Check the coredns deployment and update the image tag if required.

kubectl edit deployment coredns -n kube-system
  • Change the image line, replacing k8s.gcr.io with bldr0cuomrepo1.internal.pri:5000

Save the changes.

Restart kubelet and docker

In the 04-kubelet directory, run the update -t [tag] script. This will restart kubelet and docker on all servers.

Calico Upgrade

In the configurations/calico directory, run the following command:

$ kubectl apply -f calico.yaml

calicoctl Upgrade

Pull the updated calicoctl binary and copy it to /usr/local/bin. It’s likely already there but verify.

$ curl -O -L  https://github.com/projectcalico/calicoctl/releases/download/v3.18.2/calicoctl

kube-state-metrics Upgrade

From the configurations directory, run the following command against the kube-state-metrics directory:

$ kubectl apply -f kube-state-metrics/

metrics-server Upgrade

In the configurations/metrics-server directory, run the following command:

$ kubectl apply -f components.yaml

Filebeat Upgrade

In the configurations directory, change to the appropriate cluster context directory (bldr0-0, cabo0-0, tato0-1, and lnmt1-2) and run the following command:

$ kubectl apply -f filebeat-kubernetes.yaml

Update File and Directory Permissions and Manifests

In the postinstall directory, run the update -s [site] script. This will perform the following steps.

  • Add the cluster-name to the kube-controller-manager.yaml file
  • Update the imagePullPolicy and image lines in all manifests
  • Add the AlwaysPullImages and ResourceQuota admission controllers to the kube-apiserver.yaml file
  • Update the permissions of all files and directories.

Ansible Tags – A Story

Started a new job back in October. The team is just me and another guy and the boss. And the other guy quit in December.

The real good thing is it’s a small single project shop and pretty much all the server work is done with Ansible so lots of playbooks. Of course the bad thing is it’s just me so I’m dissecting the playbooks to see what the previous folks did and why.

One of the things is the use of Tags. Tags are defined in several places, but only in the calling playbooks, and they’re apparently not used when running the playbooks or in the roles. They’re not described in any documentation (what little there is), and the playbooks themselves don’t seem to need them.

I pulled up the Ansible docs on tags, checked a couple of youtube videos and an O’Reilly book and really didn’t see a need for Tags. Anything large enough where Tags might be useful probably should be broken down into smaller tasks anyway.

Then the boss made a request. We’re changing the IPs in the load balancer and the load balancer IP and I’d like it done via Ansible.

My first attempt was a task with a list of old IPs and a second task with a list of the new IPs. Use with_items and go. Added a backout task that just reversed the lists in case there was a problem.

Boss updated the request. We bring down Side A first, test to make sure it’s good, then Side B. A sequential list of tasks vs just delete and add. Okay, let’s see…

Started creating a bunch of little playbooks in part because of a manual check between changes.

  • Remove Side A from the Load Balancer
  • Remove the old IP from Side A
  • Add the new IP to Side A
  • Validate
  • Add Side A back to the Load Balancer
  • Remove Side B from the Load Balancer
  • Remove the old IP from Side B
  • Add the new IP to Side B
  • Validate
  • Add Side B back to the Load Balancer
  • Validate

So three playbooks. Well, let’s not forget creating similar playbooks to back out the change in case Validate == Failed. So three more playbooks. Plus a couple of edge cases. For example, if Side A is fine but there’s some network issue with Side B, backing out Side B might mean three of the backout tasks can be run but we’d want to leave the new Side A in the Load Balancer.

That’s a lot of playbooks.

Hey, Tags! Create one Update playbook and tag the tasks appropriately. Then a second Backout playbook and tag those tasks. Then run the Update playbook with --tags delsidealb,delsidea,addsidea.

So tags aren’t only for long playbooks; they’re also handy for a bunch of simple tasks that need backouts and manual verification.

Well, I thought it was cool 🙂 Learning new things is always fun and I thought I’d share.


Ansible Tags

Overview

Simply enough, Ansible Tags let you run specific tasks in a play. If you have a lengthy playbook or are testing tasks within a playbook, you can assign tags to tasks so you can run specific tasks instead of the entire playbook.

This is simply a summary of the uses of Ansible Tags, more of a cheat sheet than an attempt to instruct you in how to use them. The Ansible Tags Documentation is fairly short and does a good job of explaining how to use them.

Uses

Examples

$ ansible-playbook -i inventory dns-update.yaml --tags bind9               # only run tasks tagged with bind9
$ ansible-playbook -i inventory dns-update.yaml --skip-tags bind9          # run all tasks except the ones tagged with bind9
$ ansible-playbook -i inventory dns-update.yaml --tags "bind9,restart"     # run tasks tagged with bind9 and restart
$ ansible-playbook -i inventory dns-update.yaml --tags untagged            # only run untagged tasks
$ ansible-playbook -i inventory dns-update.yaml --tags tagged              # only run tagged tasks
$ ansible-playbook -i inventory dns-update.yaml --tags all                 # run all tasks (default)

You can assign a tag to one or more tasks.

Tasks can have multiple tags.

When you create a block of tasks, you can assign a tag to that block and all tasks within the block are run when the tag is used.
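
As a minimal sketch (the task, package, and service names here are made up for illustration; the bind9 and restart tags match the examples above), tags on individual tasks and on a block look like this:

- name: Install bind9
  yum:
    name: bind
    state: present
  tags:
    - bind9

- name: Restart and verify
  tags:
    - restart
  block:
    - name: Restart named
      service:
        name: named
        state: restarted

    - name: Verify named is answering
      command: rndc status
      changed_when: false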

An interesting idea might be to add a debug tag to all the debug statements in your playbooks; then, when ready to run live, pass the --skip-tags debug flag to the playbook so only the non-debug tasks are executed.

Special Tags

If you assign an always tag to a task, it will always run no matter what --tags value is passed, unless you specifically pass --skip-tags always.

If you assign a never tag to a task, it will not run unless you call it out specifically. Something like calling the playbook with --tags all,never.

Tag Inheritance

There are two types of statements that add tasks: the dynamic include_role, include_tasks, and include_vars, and the static import_role and import_tasks.

If you tag a task that contains an include_role or include_tasks function, only tasks within that included file that are similarly tagged will run when the tag is passed.

If you tag a task that contains an import_role or import_tasks function, all tasks within that imported file will be run when the tag is passed.

Listing Tags

By using the --list-tags option to ansible-playbook, you can list all the tags in a playbook; it exits without running anything.
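
For example, using the same playbook as the examples above:

$ ansible-playbook -i inventory dns-update.yaml --list-tags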

References

There are several sites that provide information on tags, but the obvious one is the Ansible Documentation.


Ansible Handlers

Overview

Ansible Handlers are tasks that are only performed when a calling task has successfully changed something.

Updating Docker

Say, for example, you want to try to update docker. There isn’t always an update available, but if there is one and the server is updated, docker needs to be restarted.

In the roles/docker/tasks directory, the main.yaml file looks like:

---
- name: update docker
  yum:
    name: docker
    state: latest
  notify:
  - Restart docker

In the roles/docker/handlers directory, the main.yaml file looks like:

---
- name: Restart docker
  systemd:
    daemon_reload: yes
    name: docker
    state: restarted

Notice in the first code block the notify line followed by the name of the Handler to call, Restart docker. Note that only one handler with that name can exist in the namespace; if more than one is defined, only the last one is used, and a notified handler runs only once no matter how many tasks notify it.
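
The calling playbook isn’t shown here; a minimal sketch, assuming a host group named to match the play output below, would be:

---
- hosts: kube-bldr0-0-worker
  become: yes
  roles:
    - docker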

If docker can be updated, the following example shows the results. Note the changed at the start of the lines, indicating an upgrade or restart occurred; ok indicates no change was made.

PLAY [kube-bldr0-0-worker] ***********************************************************************************************************************************
 
TASK [Gathering Facts] ***************************************************************************************************************************************
ok: [bldr0cuomknode1]
ok: [bldr0cuomknode3]
ok: [bldr0cuomknode2]
 
TASK [docker : update docker] ***************************************************************************************************************************
changed: [bldr0cuomknode1]
changed: [bldr0cuomknode2]
changed: [bldr0cuomknode3]
 
RUNNING HANDLER [docker : Restart docker] *********************************************************************************************************************
changed: [bldr0cuomknode1]
changed: [bldr0cuomknode2]
changed: [bldr0cuomknode3]
 
NO MORE HOSTS LEFT *******************************************************************************************************************************************
 
PLAY RECAP ***************************************************************************************************************************************************
bldr0cuomknode1            : ok=3    changed=2    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0
bldr0cuomknode2            : ok=3    changed=2    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0
bldr0cuomknode3            : ok=3    changed=2    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0

If you run the playbook again, the Handler will not be called as an update isn’t necessary. Again, note the ok at the start of the lines indicating no change occurred.

PLAY [kube-bldr0-0-worker] ***********************************************************************************************************************************
 
TASK [Gathering Facts] ***************************************************************************************************************************************
ok: [bldr0cuomknode1]
ok: [bldr0cuomknode2]
ok: [bldr0cuomknode3]
 
TASK [docker : update docker] ***************************************************************************************************************************
ok: [bldr0cuomknode1]
ok: [bldr0cuomknode2]
ok: [bldr0cuomknode3]
 
PLAY RECAP ***************************************************************************************************************************************************
bldr0cuomknode1            : ok=2    changed=0    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0
bldr0cuomknode2            : ok=2    changed=0    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0
bldr0cuomknode3            : ok=2    changed=0    unreachable=0    failed=0    skipped=0    rescued=0    ignored=0


Configuring Katello 3.15

I’m currently using Spacewalk to manage patches on my servers; however, it doesn’t support CentOS 8 and I suspect there’ll be more and more issues. Plus, my previous job used Satellite and the current one wants me to install Katello, so I’ve installed it on one of my VMs.

The idea here is to get the main configuration done in such a way that I’m ready to configure Products and Repositories and start Syncing, so I can update all the servers. Most of the Katello instructions describe how to use the various options, but not the order to do things in or the items to make sure get done.

Katello already has an Organization when it’s installed. You’ll likely want to rename it to something appropriate.

Click on Administer -> Organizations. Click on the Default one to edit it and rename it. Then click Submit.

Next, add the various locations. I have multiple pseudo data centers that align with environments. Click on Administer -> New Location, enter the Name and a Description, and click Submit. When the Detail page is displayed, click on Organizations and add the new organization to the location.
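
If you prefer the command line, the hammer CLI that ships with Katello can do the same thing. A couple of hedged examples (the names are placeholders and flags can vary between versions):

$ hammer organization list
$ hammer location create --name "DC1" --description "Primary data center"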

Once the Organization is edited and all the Locations are added, the rest of the configuration will have you adding them to Subnets and Domains.

Next up, add the various Puppet Environments. Click on Configure -> Environments and click on the Create Puppet Environment button to create a new Environment. Enter the Name, click on Locations and add the Environment to the correct Location, then make sure the Environment is associated with the Organization. Click Submit to save your changes.

Verify the Domain by clicking on Infrastructure -> Domains. Add a Description if missing, associate the Domain with Locations and the Organization, and then click Submit to save.

Next, add the necessary Subnets. This is a bit more involved. Click Infrastructure -> Subnets then click on the Create Subnet button.

Fill out the Name of the Subnet, a Description, Network Address and Prefix, Mask, Gateway Address, DNS Servers, and VLAN ID. You can select an IPAM option and Katello will try to anticipate the IP Addresses when they’re added. Change to the Domain tab and update that. Click on Locations and add the necessary ones. And make sure the Subnet is added to the correct Organization.

One last step is to go through the above settings and make sure all the fields are filled in. Organization, Locations, Domains, etc.
