1959 Commits

Author SHA1 Message Date
Max Gautier
88df61357b Use run_once for kubeadm init phase upload-certs (#12759)
Before "5ca23e3bf (Changed to use first_kube_control_plane to parse
kubeadm_certificate_key (#11875), 2025-01-14)", kubespray would have
problem adding new control planes when the order of the nodes in kubectl
output and the ansible inventory were not the same.

But the underlying problem is that the operation is fundamentally
something that should be done only once, and recorded for all host in
play.

Since `register` and `sef_fact` when used with `run_once` set the
variable for all the hosts, use it. Also allows to use the variable
directly instead of relying on hostvars to make the task more readable.
2025-12-07 21:07:27 -08:00
Max Gautier
633d39448e Add a default (empty) value for supplementary_addresses_in_ssl_keys (#12761)
Most variables should have a default instead of relying on the default
filter.

(Note that the variable is misnomed, this should be certs and not keys,
but it's not worth breaking compat).
2025-12-04 05:28:57 -08:00
Max Gautier
4d87ac1032 Simplify collection of SubjectAlternativeNames for apiserver (#12507)
Remove a bunch of intermediate variables, which fixes a
"'UndefinedMarker' concatenation" error in ansible-lint v25.8.1.
2025-12-04 02:06:57 -08:00
Seena Fallah
5789dc839c control-plane: fix first_kube_control_plane delegation with kube_override_hostname (#12636)
* control-plane: fix first_kube_control_plane delegation with kube_override_hostname

When kube_override_hostname is configured, the node names reported by
`kubectl get nodes` differ from the inventory_hostname known to Ansible.
This causes delegation failures in subsequent tasks since Ansible cannot
resolve the hostname from kubectl output to an inventory host.

Signed-off-by: Seena Fallah <seenafallah@gmail.com>

* control-plane: remove fragile first_control_plane selection logic

Current implementation breaks with kube_override_hostname and has
multiple edge cases. Drop until proper kubectl-based node lookup
can be implemented.

Signed-off-by: Seena Fallah <seenafallah@gmail.com>

---------

Signed-off-by: Seena Fallah <seenafallah@gmail.com>
2025-11-25 08:10:38 -08:00
Max Gautier
5acde6cfe2 Get conf checksum directly for localhost CP loadbalancer (#12632)
There is no need to stat the templated file, because the template module
already returns a checksum.
2025-10-23 22:57:36 -07:00
Max Gautier
fbf957ab5d Fix breakage when ignoring all kubeadm preflight errors (#12606)
kubeadm errors out if 'all' is specified with specific checks, so check
that case when we add hardcoded checks.

Add a test to catch regression.
2025-10-13 05:54:58 -07:00
Mahendra Reddy
e39e005306 bugfix: skip etcd cert extraction if cilium identity uses crd (#12565)
* bugfix: skip etcd cert extraction if cilium identity uses crd

* remove new line end of the file
2025-10-09 00:31:00 -07:00
David Bidorff
fe7592dd0c fix: provide an option to ignore sysctl errors about unknown keys (#12514)
* fix: provide an option to ignore sysctl errors about unknown keys

* fix: rename sysctl_ignoreerrors and remove useless var definitions
2025-09-01 07:07:14 -07:00
Hyeonki Hong
56c830713e Fix SAN address collection from ansible_default_ipv{4,6} (#12413)
Signed-off-by: Hyeonki Hong <hhk7734@gmail.com>
2025-08-26 02:40:11 -07:00
Alejandro Macedo
e2046749ac Fix: Change "empty" definition for PodSecurity Admission configuration (#12439)
Fixes a bug where `kube-apiserver` fails to start if the PodSecurity
configuration file doesn't have the `apiVersion` and `kind` keys.

Signed-off-by: Alejandro Macedo <alex.macedopereira@gmail.com>
2025-08-19 02:57:36 -07:00
Max Gautier
f832271f5c Directly list conntrack modules instead of using a variable (#12475)
The conntrack kernel modules have no reason to be something else than
those two options, so there is no reason to have a variable.
2025-08-18 09:05:13 -07:00
Elias Probst
dc9d3bf39d Fix when expr of conntrack module loading (#12458)
Retrying to load conntrack modules was bound to fail due to the way, the current `when` conditions were utilized.
It was based on the assumption, that in case of success, the registered variable would have an `rc` attribute with the value `0`.
Unfortunately, the `rc` attribute is only present in case of a failure, where it's value is >1.

The result of `community.general.modprobe` in case of success looks like this:
```
{
    "changed": false,
    "msg": "All items completed",
    "results": [
        {
            "ansible_loop_var": "item",
            "changed": false,
            "failed": false,
            "invocation": {
                "module_args": {
                    "name": "nf_conntrack",
                    "params": "",
                    "persistent": "present",
                    "state": "present"
                }
            },
            "item": "nf_conntrack",
            "name": "nf_conntrack",
            "params": "",
            "state": "present"
        }
    ],
    "skipped": false
}
```

While it looks like this in case of a failure:
```
{
    "changed": false,
    "failed": true,
    "msg": "One or more items failed",
    "results": [
        {
            "ansible_loop_var": "item",
            "attempts": 3,
            "changed": false,
            "failed": true,
            "invocation": {
                "module_args": {
                    "name": "nf_conntrack_doesnotexist",
                    "params": "",
                    "persistent": "present",
                    "state": "present"
                }
            },
            "item": "nf_conntrack_doesnotexist",
            "msg": "modprobe: FATAL: Module nf_conntrack_doesnotexist not found in directory /lib/modules/5.14.0-570.32.1.el9_6.x86_64\n",
            "name": "nf_conntrack_doesnotexist",
            "params": "",
            "rc": 1,
            "state": "present",
            "stderr": "modprobe: FATAL: Module nf_conntrack_doesnotexist not found in directory /lib/modules/5.14.0-570.32.1.el9_6.x86_64\n",
            "stderr_lines": [
                "modprobe: FATAL: Module nf_conntrack_doesnotexist not found in directory /lib/modules/5.14.0-570.32.1.el9_6.x86_64"
            ],
            "stdout": "",
            "stdout_lines": []
        }
    ],
    "skipped": false
}
```

By evaluating `failed` instead, this issue can be prevented.
See also:
- https://github.com/kubernetes-sigs/kubespray/issues/11340

Co-authored-by: Max Gautier <mg@max.gautier.name>
2025-08-18 08:17:10 -07:00
Shaleen Bathla
082507cff2 kubelet: conditionalize staticPodPath location (#12433)
Add variable to set kubelet staticPodPath location.
It can be set to empty so that we can choose to disable it for some nodes.
STIG recommendation is to disable it.

Signed-off-by: Shaleen Bathla <shaleenbathla@gmail.com>
Co-authored-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-08-18 06:51:11 -07:00
Ho Kim
707616178e feat: add support for custom kubeadm pull image repository (#12128)
Signed-off-by: Ho Kim <ho.kim@ulagbulag.io>
2025-08-13 18:03:06 -07:00
Aman Shrivastava
b4d3be482f Make control plane health check retries configurable (#12452) 2025-08-11 02:23:07 -07:00
Psycho Mantys
5b1334102b Remove --auth-anonymous if kube_api_anonymous_auth is undefined. (#12353)
Remove --auth-anonymous if kube_api_anonymous_auth in undefined, to avoid
compatibility errors with other arguments of the kube-apiserver, such as
--authentication-config when anonymous field is configured.
2025-07-26 20:20:27 -07:00
Chad Swenson
1e523a267c Fix kubeadm upgrade node skipPhases with multiple CP nodes (#12367)
Add 1.32 conditional defaults

Restore support for kubeadm upgrade node --skip-phases < 1.32, apply still needs to be restricted
2025-07-07 11:29:26 -07:00
Max Gautier
15c8a4768d Do not alter etc/hosts (#12382)
This is no longer needed, likely for a long time.
2025-07-07 04:53:26 -07:00
vdveldet
8d7cbe732e Adding proper quotation (#12371)
* Adding proper quotation

* Update file with correct quotes
2025-07-06 02:33:24 -07:00
ERIK
7ead3e2f11 fix(kubeadm): Conditionally add --skip-phases flag for v1.32.0+ (#12351)
Signed-off-by: bo.jiang <bo.jiang@daocloud.io>
2025-06-28 05:12:28 -07:00
_xat_
dca2a5ecb3 Skip kube-proxy addon phase during kubeadm upgrade if disabled (#12306) 2025-06-18 03:48:52 -07:00
Kubernetes Prow Robot
ad31de4220 Merge pull request #12132 from tico88612/fix/remove-anonymous-kubeadm-validation
Fix: kubeadm secondary use file discovery validation
2025-06-15 05:48:56 -07:00
ChengHao Yang
f77aea13e9 Cleanup: kubeadm-config v1beta4 extra args defined conditions (#12307)
* Cleanup: kubeadm-config v1beta4 extra args defined conditions

Some variables have already been defined, so there is no need to
useconditional statements to check whether they have been defined.

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>

* Cleanup: cloud-provider extra args

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>

---------

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-06-14 13:38:56 -07:00
Max Gautier
f6d29a27fc Remove stale TODOs (#12298)
Upstream consider it working as expected, won't fix
https://github.com/ansible-collections/community.general/issues/7717#issuecomment-2061880929
2025-06-12 20:14:57 -07:00
ChengHao Yang
ac0b0e7d6e Fix: upgrade cluster discovery kubeconfig not found
When installing or upgrading in the past, there was no validation
config. Check if the file exists first to prevent subsequent validation
errors.

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-06-12 10:05:59 +08:00
ChengHao Yang
e618d71f2a Fix: kubeadm secondary use file discovery validation
The validation step is moved to the end to avoid the loss of files that
may lead to verification failure.

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-06-12 10:05:58 +08:00
Peter Pan
85b0be144a Fix: check expiry before do breaking renew and container restart actions (#12194)
* Fix: check expiraty before renew

Since certificate renewal and container restarts involve higher risks,
they should be executed with extra caution.

* squash to Fix: check expiraty before renew

* squash to Fix: address more comments from VannTen

Signed-off-by: Peter Pan <Peter.Pan@daocloud.io>

---------

Signed-off-by: Peter Pan <Peter.Pan@daocloud.io>
2025-06-05 01:04:41 -07:00
Imran Ahmed
ce26f17e9e fix unquoted san cert causing issues with ips (#12256) 2025-06-02 22:50:38 -07:00
ERIK
3454cd2c69 feat: Support certificate validity period config in kubeadm v1beta4 (#12272)
Signed-off-by: bo.jiang <bo.jiang@daocloud.io>
2025-06-02 20:44:37 -07:00
Kubernetes Prow Robot
c7c3d2ba95 Merge pull request #12163 from VannTen/cleanup/etcd_inv_sample
Move etcd inventory sample doc to role defaults
2025-05-26 03:16:16 -07:00
Max Gautier
92e8ac9de2 Remove tag 'master' (#12228)
* Remove tag master

Following it's deprecation in 4b324cb0f (Rename master to control plane
- non-breaking changes only (#11394), 2024-09-06)

* Add fail fast path when using removed tags

- Used for the master tag, but this could be used for other things in
  the future
2025-05-22 01:20:36 -07:00
Anshuman Agarwala
73b3e9b557 Removed weave support (#12230) 2025-05-22 01:10:36 -07:00
Max Gautier
9c2bdeec63 Decouple etcd defaults in a separate role
This allows us to reuse the defaults in other places without putting
everything in kubespray-defaults.

In that, for kubernetes/control-plane.
2025-05-16 14:51:29 +02:00
ERIK
e4c0c427a3 improve NTP package conflict handling (#12212)
Signed-off-by: bo.jiang <bo.jiang@daocloud.io>
2025-05-16 03:55:14 -07:00
Takuya Murakami
c6dfe22a41 Improve logging of kubeadm init failure of first control plane node (#12216)
Split retry task of 'kubeadm init' to show the failure log of
the first execution.
2025-05-16 03:01:13 -07:00
Hyeonki Hong
2c3b6c9199 feat: add trigger to restart kube-apiserver when config files change (#12172)
* feat: add trigger to restart kube-apiserver when config files change

* fix: remove not upgrade_cluster_setup condition

* refactor: streamline kube-apiserver restart notifications
2025-05-15 06:51:14 -07:00
Max Gautier
47508d5c6e Rename bootstrap-os to bootstrap_os
Role names in ansible collections should not have hyphens.
2025-05-13 09:39:54 +02:00
Max Gautier
fa6888df4c kubernetes_audit: Remove redundant defaults filter (#12208) 2025-05-12 07:23:14 -07:00
Kubernetes Prow Robot
0c8dfb8e43 Merge pull request #12185 from VannTen/cleanup/iproute_with_the_rest
Move package installation to bootstrap-os
2025-05-09 20:49:14 -07:00
Chad Swenson
76707073c4 Fix indentation on AuthorizationConfiguration task (#12197) 2025-05-09 00:05:19 -07:00
Max Gautier
d6d87e9a83 Move cilium_deploy_additionnaly to kubespray-default (#12191)
Instead of using default(false) all over the place, use
kubespray-defaults
2025-05-07 05:05:17 -07:00
Max Gautier
9631b5fd44 Move etcd inventory sample doc to role defaults 2025-05-04 21:24:26 +02:00
Max Gautier
a7d681abff Install iputils with other packages 2025-05-04 21:22:49 +02:00
Max Gautier
1e79c7b3cb Move package install to bootstrap-os 2025-05-04 21:22:48 +02:00
Max Gautier
87726faab4 Move check 'sorted pkgs list to pre-commit'
This is a lint check, which should not live in the playbook itself.
2025-05-04 21:22:47 +02:00
Max Gautier
1b9919547a Split 'offline' assert into their own role
The preinstall assert cover a number of things, many of which depends
only on the inventory, and can be run without any ansible_facts
collected.

Split them off to simplify re-ordering.
2025-05-04 21:22:46 +02:00
Ho Kim
c47711c2f2 fix: correct indent of cpuManagerPolicyOptions (#12123) 2025-05-02 00:27:56 -07:00
Aviral Agarwal
1da9f0dec4 Fixed kube-vip to use kube-vip/kube-vip-iptables image instead of kube-vip/kube-vip when lb_fwdmethod or kube_vip_lb_fwdmethod is set to masquerade (#12145) 2025-04-24 15:54:30 -07:00
ChengHao Yang
4ce5510c1a [rbd-provisioner] deprecate outdated application and documentation (#12114)
* Cleanup: deprecate rbd-provisioner application

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>

* Docs: remove rbd-provisioner application

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>

---------

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-04-08 06:22:44 -07:00
ChengHao Yang
8032b8281d [cephfs-provisioner] deprecate outdated application and documentation (#12113)
* Cleanup: deprecated CephFS application

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>

* Docs: Remove CephFS Application

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>

---------

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-04-08 03:08:39 -07:00