Compare commits

549 Commits

Author SHA1 Message Date
github-actions[bot]
f5fefdeeac Patch versions updates 2025-12-07 03:00:50 +00:00
Max Gautier
118b2dce02 Remove checksums for old versions of various components (#12735)
We only keep 3 minor versions for most stuff
2025-12-05 06:30:59 -08:00
Ali Afsharzadeh
4c5eda9f1e Remove legacy tasks that were scheduled for cleanup (#12765)
Signed-off-by: Ali Afsharzadeh <afsharzadeh8@gmail.com>
2025-12-04 23:50:58 -08:00
Max Gautier
2512e0c50c Patch versions updates (#12762)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-12-04 06:44:57 -08:00
Max Gautier
633d39448e Add a default (empty) value for supplementary_addresses_in_ssl_keys (#12761)
Most variables should have a default instead of relying on the default
filter.

(Note that the variable is misnamed: it should refer to certs, not keys,
but it's not worth breaking compat).
2025-12-04 05:28:57 -08:00
Max Gautier
4d87ac1032 Simplify collection of SubjectAlternativeNames for apiserver (#12507)
Remove a bunch of intermediate variables, which fixes a
"'UndefinedMarker' concatenation" error in ansible-lint v25.8.1.
2025-12-04 02:06:57 -08:00
Chris Ricker
2342d0cd57 Calico: populate kubernetes-services-endpoint for localhost LB (#12598)
When loadbalancer_apiserver_localhost is enabled, Calico falls back to the
Kubernetes service IP because the kubernetes-services-endpoint ConfigMap is
empty. CNI then fails to reach the API server even though an nginx proxy is
listening on localhost.

Update kube_apiserver_global_endpoint to always reference the localhost load
balancer (respecting the configured port) and populate the ConfigMap for both
eBPF and localhost LB modes.
2025-12-03 07:22:19 -08:00
Azhan Latif
e6a5266bad feat: add noCompileLinks option to Docsify configuration (#12751) 2025-12-02 08:24:26 -08:00
dependabot[bot]
57f7c44718 build(deps): bump redhat-plumbers-in-action/advanced-issue-labeler (#12756)
Bumps [redhat-plumbers-in-action/advanced-issue-labeler](https://github.com/redhat-plumbers-in-action/advanced-issue-labeler) from 3.2.3 to 3.2.4.
- [Release notes](https://github.com/redhat-plumbers-in-action/advanced-issue-labeler/releases)
- [Commits](e38e6809c5...b80ae64e3e)

---
updated-dependencies:
- dependency-name: redhat-plumbers-in-action/advanced-issue-labeler
  dependency-version: 3.2.4
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-12-01 06:54:26 -08:00
Seena Fallah
5789dc839c control-plane: fix first_kube_control_plane delegation with kube_override_hostname (#12636)
* control-plane: fix first_kube_control_plane delegation with kube_override_hostname

When kube_override_hostname is configured, the node names reported by
`kubectl get nodes` differ from the inventory_hostname known to Ansible.
This causes delegation failures in subsequent tasks since Ansible cannot
resolve the hostname from kubectl output to an inventory host.

Signed-off-by: Seena Fallah <seenafallah@gmail.com>

* control-plane: remove fragile first_control_plane selection logic

Current implementation breaks with kube_override_hostname and has
multiple edge cases. Drop until proper kubectl-based node lookup
can be implemented.

Signed-off-by: Seena Fallah <seenafallah@gmail.com>

---------

Signed-off-by: Seena Fallah <seenafallah@gmail.com>
2025-11-25 08:10:38 -08:00
Max Gautier
3de6fa7220 Patch versions updates (#12743)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-11-25 01:04:37 -08:00
dependabot[bot]
9a9e8814e6 build(deps): bump peter-evans/create-pull-request from 7.0.8 to 7.0.9 (#12741)
Bumps [peter-evans/create-pull-request](https://github.com/peter-evans/create-pull-request) from 7.0.8 to 7.0.9.
- [Release notes](https://github.com/peter-evans/create-pull-request/releases)
- [Commits](271a8d0340...84ae59a2cd)

---
updated-dependencies:
- dependency-name: peter-evans/create-pull-request
  dependency-version: 7.0.9
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-11-24 02:58:38 -08:00
dependabot[bot]
87a4f61d76 build(deps): bump actions/checkout from 5.0.0 to 6.0.0 (#12740)
Bumps [actions/checkout](https://github.com/actions/checkout) from 5.0.0 to 6.0.0.
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](08c6903cd8...1af3b93b68)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-version: 6.0.0
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-11-24 02:46:36 -08:00
Max Gautier
9975b5d525 Remove download support for old calico versions (#12724)
we no longer deploy those versions
2025-11-20 04:56:01 -08:00
Max Gautier
9d06ce1a8d CI: enable unsafe_show_logs == true by default (#12702)
* CI: enable unsafe_show_logs == true by default

* Deduplicate defaults vars (unsafe_show_logs)
2025-11-19 23:10:00 -08:00
Ali Afsharzadeh
bce107ce3d Upgrade cilium from 1.18.3 to 1.18.4 (#12717)
Signed-off-by: Ali Afsharzadeh <afsharzadeh8@gmail.com>
2025-11-18 19:51:59 -08:00
Kubernetes Prow Robot
7d7a42d931 Merge pull request #12723 from VannTen/molecule_var_in_inventory
Put molecule variables in molecule inventories
2025-11-18 19:47:58 -08:00
Max Gautier
5183679a89 crio: molecule: move variables to inventory
Fix download/file (which needs the variable to determine the correct
binaries)
2025-11-18 15:44:09 +01:00
Max Gautier
b4fe577203 gvisor: molecule: move variables to inventory
Fix download/file (which needs the variable to determine the correct
binaries)
2025-11-18 15:44:08 +01:00
Max Gautier
bde51ebddf youki: molecule: move variables to inventory
Fix download/file (which needs the variable to determine the correct
binaries)
2025-11-18 15:44:06 +01:00
Max Gautier
381426d6d5 cri-docker: molecule: move container_manager to inventory var 2025-11-18 15:44:05 +01:00
Ali Afsharzadeh
b3ee6d6b75 Adjust hubble export values for cilium 1.18 schema change (#12665)
Signed-off-by: Ali Afsharzadeh <afsharzadeh8@gmail.com>
2025-11-18 00:07:37 -08:00
ChengHao Yang
7436d63faa Patch versions updates (#12678)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-11-17 05:01:39 -08:00
Max Gautier
6138c6a1a2 CI: use a dedicated disk for releases (#12692)
This should make 'no space left on device' problems easier to handle

Use /tmp/releases as local_release_dir on CI-created machines, while keeping
the same folder on the runner (needed for gitlab-ci runner pods)
2025-11-17 02:57:39 -08:00
Max Gautier
6115eba3c3 CI: label VirtualMachineInstance with PR id and pipeline ids (#12716)
Helps with CI debuggability
2025-11-17 02:21:39 -08:00
Kubernetes Prow Robot
1c008d79b1 Merge pull request #12714 from tico88612/feat/gateway-api-auto-bump
Feat: Gateway API auto bump
2025-11-16 06:27:37 -08:00
ChengHao Yang
b4bbec6772 Feat: Gateway API version always get latest
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-11-16 22:11:48 +08:00
ChengHao Yang
5c6ee4852a Bump: Gateway API to 1.4.0 and set latest version
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-11-16 21:53:59 +08:00
ChengHao Yang
8190f952c1 Feat: add Gateway API component hash update
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-11-16 21:49:43 +08:00
ChengHao Yang
3edc3d7a36 Style: components.py argocd indent fix
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-11-16 21:48:32 +08:00
peterw
2f3f1d7e65 crio: add option pull_progress_timeout (#12555) 2025-11-15 19:53:37 -08:00
Max Gautier
71c69ec12c CI: Try a full ssh connection on hosts instead of only checking the port (#12416)
* CI: Try a full ssh connection on hosts instead of only checking the port

If we only try the port, we can try to connect in the playbook which is
executed next even though the managed node has not yet completed its
boot-up sequence ("System is booting up. Unprivileged users are not
permitted to log in yet. Please come back later. For technical details,
see pam_nologin(8).")

This does not account for python-less hosts, but we don't use those in
CI anyway (for now, at least).

* CI: Remove connection method override when creating VMs

This prevented wait_for_connection from working correctly by hijacking the
connection to localhost, thus bypassing the connection check.
2025-11-15 08:37:37 -08:00
R. P. Taylor
dab0947150 change kubectl_node_schedulable var (#12661) 2025-11-15 07:01:37 -08:00
Max Gautier
5488e7d805 Update pre-commit hooks (#12707) 2025-11-14 07:51:41 -08:00
Max Gautier
ca9873cfcb crictl: remove useless layer of include_tasks (#12656) 2025-11-14 06:57:39 -08:00
Bas
65f33c3ef0 Install the clone as collection in SemaphoreUI if airgapped. (#12660)
Signed-off-by: Bas Meijer <bas.meijer@enexis.nl>
2025-11-14 06:45:40 -08:00
Anurag Ojha
5eccf9ea6c fix(cilium):correct loadBalancer.mode rendering in values.yaml (#12701) 2025-11-14 06:39:38 -08:00
Max Gautier
db599b3475 Patch version updates (#12696) 2025-11-14 04:41:45 -08:00
Chris Ricker
47140083dc Update Calico apiserver RBAC for Kubernetes 1.33+ (#12654)
Add missing RBAC permissions for Calico apiserver to function correctly
with Kubernetes 1.33+

Changes:

1. Add K8s 1.33 ValidatingAdmissionPolicy resources to calico-webhook-reader
   - validatingadmissionpolicies
   - validatingadmissionpolicybindings

Kubernetes 1.33 introduced ValidatingAdmissionPolicy resources (KEP-3488)
that require explicit RBAC permissions. Without these changes, Calico
apiserver on k8s 1.33+ will not work and needless errors are logged
2025-11-14 00:23:38 -08:00
ChengHao Yang
2d179879a0 Bump Sonobuoy to 0.57.3 (#12673)
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-11-11 19:00:56 -08:00
Max Gautier
61b8e4ce84 Test the correct version when testing collection + upgrade (#12675)
If we don't rebuild the collection and remove the previous archive, we'll
test the first one built.
2025-11-11 18:56:56 -08:00
Max Gautier
97a3776d8e Remove etcd member by peerURLs (#12682)
* Remove etcd member by peerURLs

The way to obtain the IP of a particular member is convoluted and depends
on multiple variables. The match is also textual, and it's not clear
what we're matching against.

It's also broken for etcd members that are not also Kubernetes nodes,
because the "Lookup node IP in kubernetes" task will fail and abort the
play.

Instead, match against 'peerURLs', which does not need a new variable, and
use JSON output.

* Add testcase for etcd removal on external etcd

* do not merge

* fixup! Remove etcd member by peerURLs

* fixup! Remove etcd member by peerURLs
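
The peerURLs match described in this commit can be sketched in Python as follows. This is an illustration, not the playbook's task: the sample document, the addresses and the function name `find_member_hex_id` are hypothetical, the JSON layout (`members[].ID`, `members[].peerURLs`) is assumed to be what `etcdctl member list -w json` prints, and converting the decimal ID to hexadecimal is assumed to be needed before passing it to `etcdctl member remove`.

```python
import json

# Hypothetical sample, shaped like the assumed `etcdctl member list -w json` output.
SAMPLE = json.dumps({
    "members": [
        {"ID": 17, "name": "etcd1", "peerURLs": ["https://10.0.0.1:2380"]},
        {"ID": 255, "name": "etcd2", "peerURLs": ["https://10.0.0.2:2380"]},
    ]
})


def find_member_hex_id(member_list_json, peer_url):
    """Return the hex ID of the member advertising peer_url, or None."""
    members = json.loads(member_list_json).get("members", [])
    for member in members:
        # Exact match on a list entry instead of a textual match on an IP.
        if peer_url in member.get("peerURLs", []):
            # Assumption: IDs are decimal in JSON, hexadecimal on the CLI.
            return format(member["ID"], "x")
    return None


member_id = find_member_hex_id(SAMPLE, "https://10.0.0.2:2380")
```

Matching on a structured field means a member that is not a Kubernetes node can still be found, since no node lookup is involved.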
2025-11-10 03:52:56 -08:00
Max Gautier
990695de7b Let containerd create storage / state dir (#12681)
Containerd manages these directories by itself, so there is no need to
override it and change permissions.
2025-11-10 03:42:56 -08:00
dependabot[bot]
4059c699dc build(deps): bump octokit/graphql-action from 2.3.2 to 3.0.0 (#12680)
Bumps [octokit/graphql-action](https://github.com/octokit/graphql-action) from 2.3.2 to 3.0.0.
- [Release notes](https://github.com/octokit/graphql-action/releases)
- [Commits](8ad880e4d4...abaeca7ba4)

---
updated-dependencies:
- dependency-name: octokit/graphql-action
  dependency-version: 3.0.0
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-11-10 00:50:56 -08:00
xin053
e22ce15429 Update cinder-csi image tag for offline install (#12627) 2025-11-08 21:38:52 -08:00
Max Gautier
452d4e63e0 Fix the (upgrade/remove_node) + collection test cases (#12672)
The 'old' playbook and the collection use '-' and '_' as separator,
which breaks the logic in scripts/testcases_run.sh.

Add aliases using the old schemes to make the test work and avoid
breaking anything.

Both '-' and '_' variants will be deleted once we switch to supporting
collection only.
2025-11-07 07:22:55 -08:00
Kubernetes Prow Robot
d2a46b4ff8 Merge pull request #12671 from VannTen/prep/kubectl_apply_ssa
Make kubectl_apply_stdin available to other roles (+ SSA support)
2025-11-07 03:42:52 -08:00
Max Gautier
e090c9ee26 Factor kubectl_apply_stdin into separate "vars only" role
This is needed to make it available to other roles than kubernetes-apps
2025-11-07 09:34:57 +01:00
Max Gautier
0d6d3f5828 kubectl_apply_stdin SSA support 2025-11-07 09:34:29 +01:00
Max Gautier
b9662dbd86 cleanup: don't cleanup runc orphan binary on immutable distros (#12669) 2025-11-06 22:16:53 -08:00
Ali Afsharzadeh
f5a480fdc4 Upgrade cilium from 1.18.2 to 1.18.3 (#12649) 2025-11-06 21:42:52 -08:00
Albin Björk
5dce75d29b upcloud: updated terraform provider version (#12642) 2025-10-24 00:53:34 -07:00
Max Gautier
5acde6cfe2 Get conf checksum directly for localhost CP loadbalancer (#12632)
There is no need to stat the templated file, because the template module
already returns a checksum.
2025-10-23 22:57:36 -07:00
Meza
c6926eb2f9 fix(calico): Add missed rbac verb for hostendpoints (#12641)
Signed-off-by: Meza <meza-xyz@proton.me>
2025-10-23 09:29:34 -07:00
Meza
1930ab7ed6 [docs] Fix typos found in the docs (#12638)
Signed-off-by: Meza <meza-xyz@proton.me>
2025-10-22 20:22:38 -07:00
dependabot[bot]
3edc979384 build(deps): bump cryptography from 46.0.2 to 46.0.3 (#12635)
Bumps [cryptography](https://github.com/pyca/cryptography) from 46.0.2 to 46.0.3.
- [Changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/pyca/cryptography/compare/46.0.2...46.0.3)

---
updated-dependencies:
- dependency-name: cryptography
  dependency-version: 46.0.3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-10-19 23:52:12 -07:00
Max Gautier
cde7b2b022 Remove leftover docs fragment about mitogen (#12630)
This was left behind from 1fb14b746 (docs: remove outdated mitogen
documentation. (#12619), 2025-10-14)
2025-10-17 08:44:43 -07:00
i-yasuda
0d88532f3d [kubernetes] Support kubernetes 1.34 (#12549)
* [kubernetes] Support kubernetes 1.34.0

Update hashes for kubernetes 1.34.0 except for cri-o

* [kubernetes] Support kubernetes 1.34.1

Update hashes for kubernetes 1.34.1

* [cri-o] Update cri-o to 1.34.1

---------

Co-authored-by: Takuya Murakami <tmurakam@tmurakam.org>
2025-10-17 01:56:42 -07:00
Goutham K
1fb14b7463 docs: remove outdated mitogen documentation. (#12619) 2025-10-14 05:39:38 -07:00
ChengHao Yang
a66d00a535 Releng: bump galaxy version 2.30.0 (#12622)
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-10-14 02:09:35 -07:00
ChengHao Yang
9991412b45 Docs: bump version to 2.29.0 (#12621)
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-10-14 01:29:36 -07:00
Mahendra Reddy
ee6a792ec0 feat: add support crio additional mounts (#12561)
removed default since it's already set in variables

fix pre commit issue in the pipeline
2025-10-13 18:15:32 -07:00
Max Gautier
fbf957ab5d Fix breakage when ignoring all kubeadm preflight errors (#12606)
kubeadm errors out if 'all' is specified with specific checks, so check
that case when we add hardcoded checks.

Add a test to catch regression.
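
A minimal sketch of the case handled here, written in Python rather than as the Jinja expression the role presumably uses. The function name `merge_ignored_preflight_errors` is hypothetical, and the check names in the usage lines are only illustrative values for kubeadm's `--ignore-preflight-errors` option.

```python
def merge_ignored_preflight_errors(user_checks, hardcoded_checks):
    """Combine user-supplied and hardcoded checks to ignore.

    If the user asked to ignore 'all', return only 'all', because kubeadm
    rejects 'all' combined with specific check names.
    """
    if "all" in user_checks:
        return ["all"]
    merged = list(user_checks)
    for check in hardcoded_checks:
        if check not in merged:  # keep order, avoid duplicates
            merged.append(check)
    return merged


# Illustrative values only.
only_all = merge_ignored_preflight_errors(["all"], ["Swap"])
combined = merge_ignored_preflight_errors(["NumCPU"], ["Swap", "NumCPU"])
```

A regression test for this case only needs to assert that 'all' never appears alongside another entry.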
2025-10-13 05:54:58 -07:00
dependabot[bot]
202a0f3461 build(deps): bump redhat-plumbers-in-action/advanced-issue-labeler (#12600)
Bumps [redhat-plumbers-in-action/advanced-issue-labeler](https://github.com/redhat-plumbers-in-action/advanced-issue-labeler) from 3.2.2 to 3.2.3.
- [Release notes](https://github.com/redhat-plumbers-in-action/advanced-issue-labeler/releases)
- [Commits](0db433d412...e38e6809c5)

---
updated-dependencies:
- dependency-name: redhat-plumbers-in-action/advanced-issue-labeler
  dependency-version: 3.2.3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-10-09 11:53:00 -07:00
Arthur Outhenin-Chalandre
8c16c0f2b9 owner: remove myself from reviewers (#12594)
Signed-off-by: Arthur Outhenin-Chalandre <arthur@cri.epita.fr>
2025-10-09 02:47:03 -07:00
Jan Breitkopf
deaabb694d fix missing directory when run with download_run_once (#12275) 2025-10-09 02:01:02 -07:00
Mahendra Reddy
e39e005306 bugfix: skip etcd cert extraction if cilium identity uses crd (#12565)
* bugfix: skip etcd cert extraction if cilium identity uses crd

* remove new line end of the file
2025-10-09 00:31:00 -07:00
Matthias Lohr
6d6633a905 show node name to be more clear which node is going to be upgraded (#12399)
* show node name to be more clear which node is going to be upgraded

* also show nodename when uncordoning
2025-10-09 00:19:07 -07:00
Mohamed Omar Zaian
fd7f39043b [ingress-nginx] upgrade to 1.13.3 (#12604) 2025-10-08 19:04:59 -07:00
Ali Afsharzadeh
f8e74aafb9 Fix cilium_policy_audit_mode variable (#12569)
Signed-off-by: Ali Afsharzadeh <afsharzadeh8@gmail.com>
2025-10-07 09:15:02 -07:00
ChengHao Yang
aa255f8831 Patch versions updates (#12602)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-10-07 07:25:02 -07:00
Bas
9ded45f703 Documentation - hardening.md - etcd_deployment_type: host (#12520)
* Fix for #12447

Signed-off-by: Bas Meijer <bas.meijer@enexis.nl>

* Update hardening.md

Co-authored-by: spatterlight <81454789+spatterIight@users.noreply.github.com>

---------

Signed-off-by: Bas Meijer <bas.meijer@enexis.nl>
Co-authored-by: spatterlight <81454789+spatterIight@users.noreply.github.com>
2025-10-06 02:07:00 -07:00
Mahendra Reddy
270ff65992 fix crio restart while switching runtime (#12008)
fixed kubelet condition

CRI-O: fix for handling of container runtime switching

refactored kubelet start condition

stop/start kubelet and crio only when default runtime is changed

fixed condition for runtime_matches fact variable

fixed set facts for existing container runtime

added crio runtime switch variable

changed condition to use runtime switch variable

added comment for not-found for readers
2025-10-06 01:58:59 -07:00
dependabot[bot]
324e7f50c9 build(deps): bump cryptography from 46.0.1 to 46.0.2 (#12599)
Bumps [cryptography](https://github.com/pyca/cryptography) from 46.0.1 to 46.0.2.
- [Changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/pyca/cryptography/compare/46.0.1...46.0.2)

---
updated-dependencies:
- dependency-name: cryptography
  dependency-version: 46.0.2
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-10-06 01:47:00 -07:00
R. P. Taylor
055274937b Fix variable typos (#12595) 2025-10-06 01:28:58 -07:00
philipp-check24
b98ed6ddf8 Remove update flag from pip install in ansible docs (#12590) 2025-10-03 06:56:58 -07:00
Meza
05c3e2c87c Fix typo in CONTRIBUTING.md (#12592)
Signed-off-by: Meza <meza-xyz@proton.me>
2025-10-03 04:30:57 -07:00
Alessio Greggi
b0571ccbf9 docs(hardening): fix broken link (#12577)
Signed-off-by: Alessio Greggi <ale_grey_91@hotmail.it>
2025-09-29 21:10:16 -07:00
Ali Afsharzadeh
8b62a71f31 Upgrade cilium related images (#12568)
Signed-off-by: Ali Afsharzadeh <afsharzadeh8@gmail.com>
2025-09-29 10:04:19 -07:00
JaeyungLee
411fdddaae fix(docs): update calico.md wrong image path (#12582) 2025-09-28 00:24:15 -07:00
Sassan torabkheslat
51a1f08624 reset: set v4/v6 default policies to ACCEPT and drop user chains (#12552) 2025-09-24 20:14:15 -07:00
dependabot[bot]
67632844cd build(deps): bump cryptography from 45.0.7 to 46.0.1 (#12567)
Bumps [cryptography](https://github.com/pyca/cryptography) from 45.0.7 to 46.0.1.
- [Changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/pyca/cryptography/compare/45.0.7...46.0.1)

---
updated-dependencies:
- dependency-name: cryptography
  dependency-version: 46.0.1
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-09-22 03:02:19 -07:00
Seena Fallah
13c70d3a58 coredns: set deploy replicas when dns autoscaler is disabled (#12387)
Allow setting deployment replicas through `coredns_replicas` when
`enable_dns_autoscaler` is set to false.

Signed-off-by: Seena Fallah <seenafallah@gmail.com>
2025-09-20 03:50:14 -07:00
Ali Afsharzadeh
fae4e08f35 Upgrade cilium from 1.18.1 to 1.18.2 (#12559) 2025-09-18 23:56:12 -07:00
Takuya Murakami
1d91e47878 Fix: Fix calico_crds_archive checksum (#12564)
It looks like the checksum changed due to a change in GitHub's compression algorithm.
See #12523 for details.
2025-09-18 23:14:11 -07:00
Ali Afsharzadeh
6b973d072c Upgrade haproxy load balancer from 3.1.7 to 3.2.4 (#12557)
Signed-off-by: Ali Afsharzadeh <afsharzadeh8@gmail.com>
2025-09-17 01:18:12 -07:00
ChengHao Yang
a36912e2c4 Patch versions updates (#12553)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-09-15 12:58:09 -07:00
Max Gautier
8d7d9907a1 Rough rework of the CI setup documentation (#12094) 2025-09-15 03:24:11 -07:00
Takuya Murakami
643087fea5 Bump cni-plugin 1.4.1 -> 1.8.0 (#12551)
- Add 1.5, 1.6, 1.7 and 1.8 hashes
- Drop <1.3.0

Signed-off-by: Takuya Murakami <murakami_da@nec.com>
2025-09-14 05:32:08 -07:00
Ali Afsharzadeh
2955dfe69f Upgrade flannel from 0.26.7 to 0.27.3 (#12543) 2025-09-11 00:22:07 -07:00
Ali Afsharzadeh
0a35c624ad Upgrade local-path-provisioner from 0.0.24 to 0.0.32 (#12545)
* Upgrade local-path-provisioner from 0.0.24 to 0.0.32

Signed-off-by: Ali Afsharzadeh <afsharzadeh8@gmail.com>

* Remove local_path_provisioner_image_tag variable

Co-authored-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>

---------

Signed-off-by: Ali Afsharzadeh <afsharzadeh8@gmail.com>
Co-authored-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-09-10 04:25:57 -07:00
Ali Afsharzadeh
456a3dda09 Upgrade cilium from 1.17.7 to 1.18.1 (#12542)
Signed-off-by: Ali Afsharzadeh <afsharzadeh8@gmail.com>
2025-09-09 19:47:59 -07:00
dependabot[bot]
efd30981f8 build(deps): bump actions/setup-python from 5 to 6 (#12539)
Bumps [actions/setup-python](https://github.com/actions/setup-python) from 5 to 6.
- [Release notes](https://github.com/actions/setup-python/releases)
- [Commits](https://github.com/actions/setup-python/compare/v5...v6)

---
updated-dependencies:
- dependency-name: actions/setup-python
  dependency-version: '6'
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-09-07 22:13:26 -07:00
dependabot[bot]
aabe063490 build(deps): bump cryptography from 45.0.6 to 45.0.7 (#12538)
Bumps [cryptography](https://github.com/pyca/cryptography) from 45.0.6 to 45.0.7.
- [Changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/pyca/cryptography/compare/45.0.6...45.0.7)

---
updated-dependencies:
- dependency-name: cryptography
  dependency-version: 45.0.7
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-09-07 21:21:27 -07:00
jaehanbyun
50c5f39a9d chore: add 'nftables' to kube_proxy_mode comment (#12522)
Signed-off-by: jaehanbyun <awbrg789@naver.com>
2025-09-02 00:57:15 -07:00
Takuya Murakami
8e401f94ea [calico] Add version 3.30.3 and make it default (#12523)
Signed-off-by: Takuya Murakami <murakami_da@nec.com>
2025-09-02 00:41:16 -07:00
Max Gautier
0b082ac2f4 Patch versions updates (#12518)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-09-01 20:05:12 -07:00
David Bidorff
fe7592dd0c fix: provide an option to ignore sysctl errors about unknown keys (#12514)
* fix: provide an option to ignore sysctl errors about unknown keys

* fix: rename sysctl_ignoreerrors and remove useless var definitions
2025-09-01 07:07:14 -07:00
Kim Hyunyoung, Abel
eb26449e80 fix: typo (#12517) 2025-09-01 03:07:12 -07:00
ujstor
4ab213bc44 feat: add containerd_extra_runtime_args for CRI runtime configuration (#12247)
Add support for injecting additional configuration options into the
  containerd CRI runtime plugin section via containerd_extra_runtime_args.
2025-09-01 02:57:12 -07:00
Kim Hyunyoung, Abel
66cab15498 fix: redeploy coredns and nodelocaldns when its config changed (#12401) 2025-09-01 00:23:11 -07:00
Max Gautier
c03c68e8c7 Do not suppress output during cert generation (#12479)
Makes debugging easier.
2025-08-28 19:43:09 -07:00
ERIK
72c983c41e Fix(system_packages): Avoid version comparison error on non-numeric versions (#12512)
Signed-off-by: bo.jiang <bo.jiang@daocloud.io>
2025-08-28 00:19:10 -07:00
vdveldet
a01e96e21a Introduced internal_facts.yml and adapt playbooks to use this (#12492) 2025-08-28 00:11:10 -07:00
vdveldet
e52e262e78 Making 28.3 the new docker default (#12509) 2025-08-27 19:53:09 -07:00
Max Gautier
84504d156f Fold kubernetes-apps/network_plugin into network_plugin (#12506)
From what I can see, there is no reason for the split, and it makes
things confusing.
2025-08-27 18:43:10 -07:00
Hyeonki Hong
56c830713e Fix SAN address collection from ansible_default_ipv{4,6} (#12413)
Signed-off-by: Hyeonki Hong <hhk7734@gmail.com>
2025-08-26 02:40:11 -07:00
Max Gautier
acdc338fa4 Patch versions updates (#12503)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-08-26 02:36:11 -07:00
Mahdad Ghasemian
72877d68ec Fix: render tcp and udp service ports as integers in Ingress NGINX templates (#12442) 2025-08-26 02:32:11 -07:00
Qasim Mehmood
0f158e4e28 feat: Upgrade multus cni from 4.1.0 to 4.2.2 (#12495) 2025-08-26 02:28:10 -07:00
Ali Afsharzadeh
7d79f17b12 Fix duplicate dict key warning in bootstrap_os task includes (#12488)
Signed-off-by: Ali Afsharzadeh <afsharzadeh8@gmail.com>
2025-08-26 01:32:11 -07:00
wangsifei99
f973deb95f fix netcheck_etcd_image_tag (#12402)
Signed-off-by: wangsifei99 <wangsifei@kylinos.cn>
2025-08-25 22:49:06 -07:00
Ali Afsharzadeh
4a4201c84d Remove ara_default from callbacks_enabled (#12490)
The option ara_default was still present in ansible.cfg under callbacks_enabled.
This is a leftover from commit b9e9364 ("Remove ara support in CI") and should
have been removed together with the rest of the ara integration.

Signed-off-by: Ali Afsharzadeh <afsharzadeh8@gmail.com>
2025-08-24 22:49:06 -07:00
Mohamed Omar Zaian
80e0ad0fac [feat] Update metrics server to v0.8.0 (#12493) 2025-08-22 21:07:05 -07:00
Ali Afsharzadeh
303dd1cbc1 Enable reserved variable name checks and fix violations (#12463)
* Enable reserved variable name checks and fix violations

Updated .ansible-lint configuration to skip only var-naming[pattern]
and var-naming[no-role-prefix] instead of skipping the entire var-naming rule.
This enables the check for reserved variable names.

Renamed variables that used reserved names to avoid conflicts.
Updated all references in tasks, variables, and templates.

Signed-off-by: Ali Afsharzadeh <afsharzadeh8@gmail.com>

* Rename namespace variable inside tasks instead of deleting it

Signed-off-by: Ali Afsharzadeh <afsharzadeh8@gmail.com>

* Change hosts variable to vm_hosts

Signed-off-by: Ali Afsharzadeh <afsharzadeh8@gmail.com>

* Use k8s_namespace instead of dashboard_namespace in dashboard.yml.j2 template

Signed-off-by: Ali Afsharzadeh <afsharzadeh8@gmail.com>

---------

Signed-off-by: Ali Afsharzadeh <afsharzadeh8@gmail.com>
2025-08-21 00:47:07 -07:00
Kubernetes Prow Robot
eb4f6d73fb Merge pull request #12441 from tico88612/feat/crds-installation
Feat: add common_crds role and Prometheus Operator CRDs installation
2025-08-19 05:25:37 -07:00
ChengHao Yang
44f511814b Test: add prometheus operator crds install
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-08-19 18:45:33 +08:00
Alejandro Macedo
e2046749ac Fix: Change "empty" definition for PodSecurity Admission configuration (#12439)
Fixes a bug where `kube-apiserver` fails to start if the PodSecurity
configuration file doesn't have the `apiVersion` and `kind` keys.

Signed-off-by: Alejandro Macedo <alex.macedopereira@gmail.com>
2025-08-19 02:57:36 -07:00
Max Gautier
f832271f5c Directly list conntrack modules instead of using a variable (#12475)
The conntrack kernel modules have no reason to be anything other than
those two options, so there is no reason to have a variable.
2025-08-18 09:05:13 -07:00
Elias Probst
dc9d3bf39d Fix when expr of conntrack module loading (#12458)
Retrying to load conntrack modules was bound to fail because of how the current `when` conditions were written.
They assumed that, in case of success, the registered variable would have an `rc` attribute with the value `0`.
Unfortunately, the `rc` attribute is only present in case of a failure, where its value is non-zero.

The result of `community.general.modprobe` in case of success looks like this:
```
{
    "changed": false,
    "msg": "All items completed",
    "results": [
        {
            "ansible_loop_var": "item",
            "changed": false,
            "failed": false,
            "invocation": {
                "module_args": {
                    "name": "nf_conntrack",
                    "params": "",
                    "persistent": "present",
                    "state": "present"
                }
            },
            "item": "nf_conntrack",
            "name": "nf_conntrack",
            "params": "",
            "state": "present"
        }
    ],
    "skipped": false
}
```

While it looks like this in case of a failure:
```
{
    "changed": false,
    "failed": true,
    "msg": "One or more items failed",
    "results": [
        {
            "ansible_loop_var": "item",
            "attempts": 3,
            "changed": false,
            "failed": true,
            "invocation": {
                "module_args": {
                    "name": "nf_conntrack_doesnotexist",
                    "params": "",
                    "persistent": "present",
                    "state": "present"
                }
            },
            "item": "nf_conntrack_doesnotexist",
            "msg": "modprobe: FATAL: Module nf_conntrack_doesnotexist not found in directory /lib/modules/5.14.0-570.32.1.el9_6.x86_64\n",
            "name": "nf_conntrack_doesnotexist",
            "params": "",
            "rc": 1,
            "state": "present",
            "stderr": "modprobe: FATAL: Module nf_conntrack_doesnotexist not found in directory /lib/modules/5.14.0-570.32.1.el9_6.x86_64\n",
            "stderr_lines": [
                "modprobe: FATAL: Module nf_conntrack_doesnotexist not found in directory /lib/modules/5.14.0-570.32.1.el9_6.x86_64"
            ],
            "stdout": "",
            "stdout_lines": []
        }
    ],
    "skipped": false
}
```

By evaluating `failed` instead, this issue can be prevented.
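
The difference between the two checks can be sketched in Python. This only models the logic: the real condition is an Ansible `when` expression that is not quoted in this message, the two dictionaries are trimmed copies of the per-item results shown above, and the function names are hypothetical.

```python
# Trimmed per-item results, shaped like the two outputs quoted above.
success_item = {"changed": False, "failed": False,
                "name": "nf_conntrack", "state": "present"}
failure_item = {"changed": False, "failed": True, "rc": 1,
                "name": "nf_conntrack_doesnotexist", "state": "present"}


def loaded_by_rc(item):
    """Check that assumes a successful result carries rc == 0."""
    return item.get("rc") == 0


def loaded_by_failed(item):
    """Check based on `failed`, which is present in both cases."""
    return not item.get("failed", False)


# On success there is no `rc` key, so the rc-based check reports "not loaded".
rc_says_loaded = loaded_by_rc(success_item)
failed_says_loaded = loaded_by_failed(success_item)
```

With the success result, the `rc`-based check never becomes true, while the `failed`-based check does; both agree on the failure result.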
See also:
- https://github.com/kubernetes-sigs/kubespray/issues/11340
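
A minimal Python sketch of the difference, using trimmed copies of the two per-item results above. The helper name `loop_failed` is hypothetical. It only illustrates that `failed` is present in both shapes, whereas fields such as `rc` and `stderr` appear only in the failed item.

```python
# Trimmed copies of the per-item results shown above.
success_item = {
    "changed": False,
    "failed": False,
    "name": "nf_conntrack",
    "state": "present",
}
failure_item = {
    "changed": False,
    "failed": True,
    "name": "nf_conntrack_doesnotexist",
    "rc": 1,
    "state": "present",
}


def loop_failed(item):
    """Return True when a loop item failed, relying only on the `failed` key."""
    return bool(item.get("failed", False))


# `rc` exists only on the failed item, so indexing it on a success raises KeyError.
try:
    success_item["rc"]
    rc_available = True
except KeyError:
    rc_available = False

print(loop_failed(success_item), loop_failed(failure_item), rc_available)
```

Reading `failed` works for both results, while any check that dereferences `rc` has nothing to read when the module load succeeded.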

Co-authored-by: Max Gautier <mg@max.gautier.name>
2025-08-18 08:17:10 -07:00
Ali Afsharzadeh
7d3e0d4fe5 Simplify group_by logic by moving conditional to when clause (#12469)
Signed-off-by: Ali Afsharzadeh <afsharzadeh8@gmail.com>
2025-08-18 07:39:11 -07:00
ChengHao Yang
9dca520b33 Feat: add prometheus_operator_crds in common_crds
The Prometheus Operator CRDs are commonly used for monitoring and are
used by some CNIs (such as Cilium). Kubespray can be installed first,
and the subsequent installation of the operator can be handled by the
user (or later extensions).

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-08-18 22:13:15 +08:00
Ali Afsharzadeh
fa22f9e5ab Ensure apt cache is updated before dist-upgrade (#12465)
Signed-off-by: Ali Afsharzadeh <afsharzadeh8@gmail.com>
2025-08-18 07:13:12 -07:00
Shaleen Bathla
082507cff2 kubelet: conditionalize staticPodPath location (#12433)
Add variable to set kubelet staticPodPath location.
It can be set to empty so that we can choose to disable it for some nodes.
STIG recommendation is to disable it.

Signed-off-by: Shaleen Bathla <shaleenbathla@gmail.com>
Co-authored-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-08-18 06:51:11 -07:00
ChengHao Yang
1e327b4747 Feat: add prometheus_operator_crds download item
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-08-18 21:14:06 +08:00
ChengHao Yang
3ece592b51 Refactor: add common_crds role & migrate gateway_api
The set of commonly used CRDs can be expanded later.

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-08-18 19:27:25 +08:00
dependabot[bot]
bae7278fa8 build(deps): bump actions/checkout from 4.2.2 to 5.0.0 (#12472)
Bumps [actions/checkout](https://github.com/actions/checkout) from 4.2.2 to 5.0.0.
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](11bd71901b...08c6903cd8)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-version: 5.0.0
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-08-18 01:41:10 -07:00
ChengHao Yang
cf2332c38f Patch versions updates (#12461)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-08-17 20:21:08 -07:00
Ali Afsharzadeh
51764b208b Upgrade cilium from 1.17.3 to 1.17.7 (#12470)
Signed-off-by: Ali Afsharzadeh <afsharzadeh8@gmail.com>
2025-08-16 10:53:07 -07:00
Andrew Peabody
936f9faeaf docs: update OS and firewall (#12464)
* docs: update OS and firewall

* Update setting-up-your-first-cluster.md
2025-08-15 17:17:06 -07:00
Ho Kim
707616178e feat: add support for custom kubeadm pull image repository (#12128)
Signed-off-by: Ho Kim <ho.kim@ulagbulag.io>
2025-08-13 18:03:06 -07:00
Kubernetes Prow Robot
155c1c1531 Merge pull request #12456 from tico88612/feat/debian13
Feat: Debian 13 Trixie support
2025-08-13 00:05:14 -07:00
ChengHao Yang
7f64758592 Fix: Debian 13 system_package not found software-properties-common
Debian Trixie recently removed the package `software-properties-common`,
so only install it when the host is not running Debian Trixie.

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-08-12 20:29:35 +08:00
ChengHao Yang
4e1205958f Docs: add Debian 13 in README.md
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-08-12 20:29:35 +08:00
ChengHao Yang
2081df24ec CI: add Debian 13 tests
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-08-12 20:29:35 +08:00
ChengHao Yang
7a72031d1e Add Debian 13 kubevirt image
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-08-12 20:29:35 +08:00
ChengHao Yang
622ed15532 Fix the typo in debian12-calico test
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-08-12 20:28:33 +08:00
Aman Shrivastava
b4d3be482f Make control plane health check retries configurable (#12452) 2025-08-11 02:23:07 -07:00
dependabot[bot]
92f57e0811 build(deps): bump cryptography from 45.0.5 to 45.0.6 (#12453)
Bumps [cryptography](https://github.com/pyca/cryptography) from 45.0.5 to 45.0.6.
- [Changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/pyca/cryptography/compare/45.0.5...45.0.6)

---
updated-dependencies:
- dependency-name: cryptography
  dependency-version: 45.0.6
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-08-10 22:57:04 -07:00
Clement Phu
6c147dfe3c Add cilium_extra_values to make use of any cilium values (#12375)
fix noqa
2025-08-08 20:29:43 -07:00
ChengHao Yang
502ba663c5 Fix: Convert -backports sources to archive.debian.org for bullseye and older (#12434)
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-08-06 20:26:45 -07:00
Aman Shrivastava
5e54fd4da3 add proxy_env to cilium install task for proxy support (#12417) 2025-07-30 00:34:27 -07:00
Ho Kim
f347c12145 feat: add support for coredns_affinity (#11994)
Signed-off-by: Ho Kim <ho.kim@ulagbulag.io>
Co-authored-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-07-27 20:18:27 -07:00
Xuhui Sun
95640819f5 🐛 fix missing cilium_enable_bgp_control_plane config (#12430) 2025-07-26 21:38:26 -07:00
Psycho Mantys
5b1334102b Remove --auth-anonymous if kube_api_anonymous_auth is undefined. (#12353)
Remove --auth-anonymous if kube_api_anonymous_auth is undefined, to avoid
compatibility errors with other arguments of the kube-apiserver, such as
--authentication-config when anonymous field is configured.
2025-07-26 20:20:27 -07:00
ak1ra
96c39ae7fd doc: modify section order, remove redundant section (#12423) 2025-07-25 01:52:38 -07:00
Romain Lalaut
d198b2ca53 playbooks/remove_node.yml: fixes localhost validation task (#12420)
- Add gather_facts: false (no system facts needed for validation)
- Add become: false (no privilege escalation needed on localhost)
2025-07-25 01:52:31 -07:00
dependabot[bot]
9e8bf18aa1 build(deps): bump distlib from 0.3.9 to 0.4.0 (#12415)
Bumps [distlib](https://github.com/pypa/distlib) from 0.3.9 to 0.4.0.
- [Release notes](https://github.com/pypa/distlib/releases)
- [Changelog](https://github.com/pypa/distlib/blob/master/CHANGES.rst)
- [Commits](https://github.com/pypa/distlib/compare/0.3.9...0.4.0)

---
updated-dependencies:
- dependency-name: distlib
  dependency-version: 0.4.0
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-07-23 04:48:26 -07:00
Kubernetes Prow Robot
fcaaee537e Merge pull request #12377 from yankay/bump-containerd
feature: support containerd static binary
2025-07-19 05:56:25 -07:00
Kay Yan
97946cfdb7 support containerd static binary
Co-authored-by: Max Gautier <mg@max.gautier.name>
Signed-off-by: Kay Yan <kay.yan@daocloud.io>
2025-07-18 04:20:58 +00:00
Kay Yan
72518b4497 bump containerd to 2.1.3, runc to 1.3.0, nerdctl to 2.1.2
Signed-off-by: Kay Yan <kay.yan@daocloud.io>
2025-07-18 04:19:35 +00:00
Chad Swenson
18d7a02280 Patch versions updates (#12410)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-07-17 20:28:24 -07:00
ChengHao Yang
8d275dcb4f Fix: nodelocaldns capabilities usage (#12398)
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-07-15 18:54:22 -07:00
vdveldet
ff2179985c Added new checksums for helm_archive_checksums (#12389)
* Added new checksums for helm_archive_checksums

* Changed default helm version
2025-07-15 01:00:23 -07:00
Romain Lalaut
b1cc016cc0 Add external_openstack_lbaas_member_subnet_id variable to external-openstack-cloud-config.j2 (#12267) 2025-07-13 23:58:24 -07:00
wangsifei99
263e8b24cf Fix #12385 cilium typo (#12393)
Signed-off-by: wangsifei99 <wangsifei@kylinos.cn>
2025-07-10 19:19:27 -07:00
mathgaming
ce2ba28dec Fixed syntax error in _bgp_config dict (#12258) 2025-07-10 18:43:27 -07:00
Takuya Murakami
784bf36c66 fix: Use crun in the cri-o distribution and don't use crun role from cri-o role anymore (#12289)
Signed-off-by: Takuya Murakami <murakami_da@nec.com>
2025-07-08 06:37:27 -07:00
Max Gautier
cbdfad8e80 CI: fix broken debugging (#12381) 2025-07-08 04:33:27 -07:00
pando85
d02910c675 Add header configuration in containerd hosts.toml (#12368)
* Add header configuration in containerd hosts.toml

Signed-off-by: Alexander Gil <pando855@gmail.com>

* Disable log output on containerd mirrors settings if required

Signed-off-by: Alexander Gil <pando855@gmail.com>

---------

Signed-off-by: Alexander Gil <pando855@gmail.com>
2025-07-07 23:41:27 -07:00
Chad Swenson
1e523a267c Fix kubeadm upgrade node skipPhases with multiple CP nodes (#12367)
Add 1.32 conditional defaults

Restore support for kubeadm upgrade node --skip-phases < 1.32, apply still needs to be restricted
2025-07-07 11:29:26 -07:00
Max Gautier
15c8a4768d Do not alter etc/hosts (#12382)
This is no longer needed, and likely has not been for a long time.
2025-07-07 04:53:26 -07:00
Elias Probst
6ca9f1f731 docs: Ansible Collection 404s (#12376)
* docs: remove obsolete reference to `gen_tags.sh`

`scripts/gen_tags.sh` was removed in 373b952a0c

* docs: fix 404 links

Merge the `Requirements` section with the `Usage` section and just
reference the inventory documentation, which then points to all further
information related to group vars etc.
2025-07-07 03:01:25 -07:00
wangsifei99
3311ceaa7b Fix kubespray reset shouldn't remove /etc/dnsmasq files (#12380)
Signed-off-by: wangsifei99 <2209856191@qq.com>
2025-07-07 00:25:25 -07:00
Max Gautier
6354aa686e Allow vagrant jobs to be triggered manually in Gitlab UI (#12349) 2025-07-06 23:57:25 -07:00
dependabot[bot]
90d5b34eca build(deps): bump cryptography from 45.0.4 to 45.0.5 (#12378)
Bumps [cryptography](https://github.com/pyca/cryptography) from 45.0.4 to 45.0.5.
- [Changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/pyca/cryptography/compare/45.0.4...45.0.5)

---
updated-dependencies:
- dependency-name: cryptography
  dependency-version: 45.0.5
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-07-06 23:49:26 -07:00
Kay Yan
7f6db0cbfa add rocky linux 10 image (#12379)
Signed-off-by: Kay Yan <kay.yan@daocloud.io>
2025-07-06 23:45:26 -07:00
vdveldet
8d7cbe732e Adding proper quotation (#12371)
* Adding proper quotation

* Update file with correct quotes
2025-07-06 02:33:24 -07:00
Dexter
1e5a203ddc Vagrant: Ensure IP Subnet not in use by localhost (#12332)
* feat(subnet): Ensure Vagrant subnet not in use by localhost

This commit ensures that the Vagrantfile-supplied $subnet is not in use by
the localhost. Previously, if the subnet was in use by localhost (e.g. by a
bridge network), Vagrant VM boxes could not communicate.

* refactor(socket): Use ruby Socket library to find addrs

This commit reverts the usage of Ruby .scan() which may result in
failure if program is not provided. Instead, this commit refactors to
use Socket library to determine interfaces in use, then proceeds to
compare with Vagrantfile supplied subnets. Additionally, the commit
supports IPv6 comparisons.
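
The kind of check described above can be sketched in Python with the standard `ipaddress` module. This is an illustration only, not the Ruby code from the Vagrantfile: the function name `subnet_in_use`, the CIDR notation, and the sample addresses are all assumptions made for the example.

```python
import ipaddress


def subnet_in_use(subnet, local_addresses):
    """Return True if any local address falls inside `subnet` (IPv4 or IPv6)."""
    network = ipaddress.ip_network(subnet, strict=False)
    for addr in local_addresses:
        ip = ipaddress.ip_address(addr)
        # Only compare addresses of the same IP version as the subnet.
        if ip.version == network.version and ip in network:
            return True
    return False


# Hypothetical addresses assigned to interfaces on the local machine.
local = ["192.168.1.10", "172.18.0.1", "fd3c:b398:698:756::1"]

print(subnet_in_use("172.18.0.0/16", local))           # True
print(subnet_in_use("10.0.0.0/24", local))             # False
print(subnet_in_use("fd3c:b398:698:756::/64", local))  # True
```

Comparing parsed addresses against a parsed network, rather than matching text, is what lets the same check cover both IPv4 and IPv6.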
2025-07-04 01:15:25 -07:00
Mustafa Mertcan Çam
cde6e815dd Cilium: Pass cluster DNS to hubble.peerService in values.yaml.j2 (#12346)
* cilium: pass cluster DNS to hubble.peerService in values.yaml.j2

* Add dedicated Hubble variable defaulting to inventory cluster domain
2025-07-03 09:37:25 -07:00
ERIK
c1c52002cf Remove unused Calico CNI pool variables (#12369)
Signed-off-by: bo.jiang <bo.jiang@daocloud.io>
2025-07-02 18:43:24 -07:00
Zied Kharrat
5cd3f40cbc Replace deprecated MAINTAINER with OCI-compliant LABEL (#12360) 2025-06-30 17:06:31 -07:00
Aman Shrivastava
f9385ec918 Add argocd_install component to hash update script with checksum entries (#12358) 2025-06-30 07:00:35 -07:00
ERIK
7ead3e2f11 fix(kubeadm): Conditionally add --skip-phases flag for v1.32.0+ (#12351)
Signed-off-by: bo.jiang <bo.jiang@daocloud.io>
2025-06-28 05:12:28 -07:00
Max Gautier
e0018268d6 CI: Add a test for scale.yml (#12285) 2025-06-28 01:16:29 -07:00
Kubernetes Prow Robot
d4cb5da017 Merge pull request #12295 from VannTen/ci/collection
CI: Simplify running playbooks as collection + various CI Fixes
2025-06-27 10:00:33 -07:00
刘旭
62f49822dd fix ETCD_INITIAL_CLUSTER config in etcd.env and etcd-events.env (#12342) 2025-06-27 07:44:29 -07:00
Romain Lalaut
878da9fb16 Argo CD : checksum support for the install url (#12266)
Fixes https://github.com/kubernetes-sigs/kubespray/issues/12223
2025-06-27 07:24:30 -07:00
Max Gautier
f55de03fa6 CI: update sonobuoy url (heptio is now part of VMware)
Suggested-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-06-27 14:32:21 +02:00
Max Gautier
7b6ff769f0 CI: 020_check_pods -> more readable output
Filter pods so that only the broken ones are described and have their logs printed.
2025-06-27 14:32:20 +02:00
Max Gautier
e369ac2f24 CI: more readable loop
Avoids putting whole pod spec in loop label
2025-06-27 14:32:19 +02:00
Max Gautier
4a0a73b307 CI: fix check for kube_version 2025-06-27 14:15:11 +02:00
Max Gautier
253fc5ee59 CI: factorize tests into a single playbook
This allows using kubespray_defaults (once) instead of redefining
defaults in the tests.
The test files become imported tasks rather than standalone
playbooks.
2025-06-27 14:15:11 +02:00
Max Gautier
bf41d3bfea CI: Simplify running playbooks as collection 2025-06-27 14:15:09 +02:00
Chad Swenson
ede92b0654 Fix calico etcd mode networkpolicy RBAC (#12344) 2025-06-27 04:50:29 -07:00
Takuya Murakami
048967e3b0 feat: Add cilium_install_extra_flags (#12262)
Enables the use of options such as --chart-directory for offline installation

Signed-off-by: Takuya Murakami <murakami_da@nec.com>
2025-06-25 05:58:29 -07:00
Kim Hyunyoung, Abel
8cc5897d5c fix: add cilium extraConfig values (#12335) 2025-06-23 23:36:29 -07:00
ChengHao Yang
479e239016 CI: replace kaniko with buildkit (#12305)
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-06-23 00:08:53 -07:00
Chad Swenson
39e0fc64ba Patch versions updates (#12322)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-06-19 23:10:51 -07:00
Kubernetes Prow Robot
5ed7042808 Merge pull request #11924 from tico88612/bump/ansible-10.7.0
Bump Ansible to 10.7.0 & Deprecate Pre-installed Python 3.7-OS tests
2025-06-19 23:06:51 -07:00
ChengHao Yang
48cc0e1cde CI: add pip install in upgrade job
This avoids the Ansible version check failing in the upgrade job.

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-06-20 12:18:10 +08:00
ChengHao Yang
854dbef25e Docs: remove unused CI tests information
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-06-20 12:18:10 +08:00
ChengHao Yang
95998e437b CI: remove OpenSUSE 15.6 tests
The pre-installed Python version is 3.6, which is deprecated by
Ansible 10.

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-06-20 12:18:09 +08:00
ChengHao Yang
fc0206e313 CI: remove RHEL8-related OS tests
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-06-20 12:18:09 +08:00
ChengHao Yang
26acce9cec Docs: update ansible-core version to 2.17.3
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-06-20 12:18:09 +08:00
ChengHao Yang
d3c3ccd168 Update ansible_version minimal and maximal version
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-06-20 12:18:09 +08:00
ChengHao Yang
58e302ec31 Bump ansible to 10.7.0
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-06-20 12:18:09 +08:00
ChengHao Yang
3cda93405a Cleanup: Ubuntu 20.04 tests (#12301)
* Test: molecule replace ubuntu2004 with ubuntu2204 ubuntu2404

cri-dockerd, adduser and bastion-ssh-config can't run on ubuntu2404; the login check may need investigating.

"System is booting up. Unprivileged users are not permitted to log in yet. Please come back later. For technical details, see pam_nologin(8)."

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>

* Test: replace ubuntu-2004 with ubuntu-2404

All ubuntu-2004 tests are removed.

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>

* Docs: update ci.md

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>

* Docs: update README.md

Remove Ubuntu 20.04 support

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>

---------

Signed-off-by: ChengHao Yang
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-06-19 18:54:51 -07:00
Chad Swenson
540cfd1087 Add version pinning for AWS tf provider to fix CI (#12323) 2025-06-19 18:38:51 -07:00
Kubernetes Prow Robot
f58315f69e Merge pull request #12254 from tico88612/fix/cilium-migration
Fix: Cilium breaks when the cluster is upgraded from 2.27 to 2.28
2025-06-19 18:34:51 -07:00
_xat_
dca2a5ecb3 Skip kube-proxy addon phase during kubeadm upgrade if disabled (#12306) 2025-06-18 03:48:52 -07:00
Max Gautier
85cf0014cd CI: Run vagrant validate on master as well (#12311)
There is no real reason not to, and skipping it actually breaks daily-ci:
some jobs depend on this one, so the whole pipeline is invalid if it is
not created.
2025-06-18 02:12:53 -07:00
Kubernetes Prow Robot
170b3dc55d Merge pull request #12302 from VannTen/ci/factorize_molecule_scenario
CI: cleanup and factorization of molecule tests
2025-06-17 10:23:00 -07:00
Max Gautier
50a32acf51 CI: use debug stdout callback everywhere (except pre-commit) 2025-06-17 14:56:16 +02:00
Max Gautier
b372a6f0f3 Fix alternatives runtimes CI
- youki and gvisor molecule tests are now passing
- kata-containers still broken
2025-06-17 14:56:15 +02:00
Max Gautier
5671037b0e Convert alternatives runtimes molecule to ansible verifier 2025-06-17 14:56:14 +02:00
Max Gautier
1ccb3a38a2 Convert cri-dockerd molecule to ansible verifier 2025-06-17 14:56:06 +02:00
Max Gautier
68c4ee23cb Convert CRI-O molecule to ansible verifier 2025-06-17 14:56:04 +02:00
Max Gautier
3f26203ed0 Convert containerd molecule to ansible verifier 2025-06-17 14:56:02 +02:00
Max Gautier
a5ede2a5c7 container-engine: factorize molecule testing infra 2025-06-17 14:56:00 +02:00
Max Gautier
69c4c90634 Factorize dynamic groups into a role 2025-06-17 14:55:59 +02:00
Max Gautier
06d8d48488 Add previous release to the auto update script. (#12312) 2025-06-16 22:53:00 -07:00
dependabot[bot]
9c621970ff build(deps): bump cryptography from 45.0.3 to 45.0.4 (#12317)
Bumps [cryptography](https://github.com/pyca/cryptography) from 45.0.3 to 45.0.4.
- [Changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/pyca/cryptography/compare/45.0.3...45.0.4)

---
updated-dependencies:
- dependency-name: cryptography
  dependency-version: 45.0.4
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-06-16 18:39:02 -07:00
dependabot[bot]
7bb9d57dc9 build(deps): bump redhat-plumbers-in-action/advanced-issue-labeler (#12318)
Bumps [redhat-plumbers-in-action/advanced-issue-labeler](https://github.com/redhat-plumbers-in-action/advanced-issue-labeler) from 3.2.1 to 3.2.2.
- [Release notes](https://github.com/redhat-plumbers-in-action/advanced-issue-labeler/releases)
- [Commits](39087a4b30...0db433d412)

---
updated-dependencies:
- dependency-name: redhat-plumbers-in-action/advanced-issue-labeler
  dependency-version: 3.2.2
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-06-15 20:38:58 -07:00
Max Gautier
f866fd76f8 Patch versions updates (#12313)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-06-15 20:20:57 -07:00
ChengHao Yang
fa880b6bcc Feat: add nftable mode in calico (#12255)
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-06-15 18:54:58 -07:00
Jay.H
6fc1abba2e fix offline prepare scripts (#11962)
fix offline prepare scripts
2025-06-15 07:28:58 -07:00
Jay.H
1abadd8caa fix manage-offline-container-images.sh get image_id (#11961) 2025-06-15 07:10:57 -07:00
Kubernetes Prow Robot
ad31de4220 Merge pull request #12132 from tico88612/fix/remove-anonymous-kubeadm-validation
Fix: kubeadm secondary use file discovery validation
2025-06-15 05:48:56 -07:00
Max Gautier
144742cbce Use last patch versions by default for etcd/crio/crictl (#12233)
This uses the same logic as the other versions, with simplifications for
crictl and crio, whose versioning scheme is tied to upstream Kubernetes.

Also move some version variables to vars/ rather than defaults/, because
they are not used elsewhere and don't really make sense as modifiable by
the user.
2025-06-14 18:56:55 -07:00
ChengHao Yang
f77aea13e9 Cleanup: kubeadm-config v1beta4 extra args defined conditions (#12307)
* Cleanup: kubeadm-config v1beta4 extra args defined conditions

Some variables have already been defined, so there is no need to
use conditional statements to check whether they have been defined.

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>

* Cleanup: cloud-provider extra args

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>

---------

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-06-14 13:38:56 -07:00
ChengHao Yang
f810e80b6c Bump: external snapshot CRD to v0.15.0 (#12308)
Currently, there is no reliable way to obtain individual CRD files, so
the only solution is to update first.

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-06-14 13:34:56 -07:00
Chad Swenson
b04ceba89b Fix calico CNI timeouts in reset role (#12300)
* Fix an issue with CNI timeouts in reset role

* Consolidate secondary service removal tasks
2025-06-13 02:54:56 -07:00
Max Gautier
f6d29a27fc Remove stale TODOs (#12298)
Upstream considers it to be working as expected and won't fix it:
https://github.com/ansible-collections/community.general/issues/7717#issuecomment-2061880929
2025-06-12 20:14:57 -07:00
Kubernetes Prow Robot
28d23ffc3b Merge pull request #12236 from VannTen/cleanup/bootstap+packages
Cleanup of bootstrap and package installation
2025-06-12 07:24:56 -07:00
ChengHao Yang
ac0b0e7d6e Fix: upgrade cluster discovery kubeconfig not found
When installing or upgrading in the past, there was no validation
config. Check if the file exists first to prevent subsequent validation
errors.

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-06-12 10:05:59 +08:00
ChengHao Yang
e618d71f2a Fix: kubeadm secondary use file discovery validation
The validation step is moved to the end to avoid the loss of files that
may lead to verification failure.

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-06-12 10:05:58 +08:00
Kay Yan
cd82ac552b Add CI images for Fedora 41 and Fedora 42 (#12286)
* add CI image fedora-41 and fedora-42

Signed-off-by: Kay Yan <kay.yan@daocloud.io>

* Apply suggestions from code review

Co-authored-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>

---------

Signed-off-by: Kay Yan <kay.yan@daocloud.io>
Co-authored-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-06-10 21:14:54 -07:00
ChengHao Yang
b981e2f740 Replace terraform with opentf (#12291)
Terraform is no longer open source software and has been removed and
replaced with OpenTofu.

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-06-10 06:44:25 -07:00
Kubernetes Prow Robot
739e5e1c6b Merge pull request #12199 from tmurakam/feature/kubernetes-1.33
[kubernetes] Support kubernetes 1.33
2025-06-05 20:20:38 -07:00
ChengHao Yang
1f9020f0b4 Fix: if cilium release exist, the action will set upgrade
`cilium install` is equivalent to `helm install`, so it will fail if a
cilium release already exists. `cilium version` can tell whether the
release exists without needing the helm binary.

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-06-05 21:14:49 +08:00
ChengHao Yang
7bb9552e94 Fix: add cilium remove old resources option
Give users two options: besides skipping Cilium, add
`cilium_remove_old_resources`, which defaults to `false`. When set to
`true`, it removes the resources of the old version, but this causes
downtime, so use it with care.

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-06-05 21:14:49 +08:00
Slavi Pantaleev
d1bd610049 Fix indentation issue in Cilium values file and ensure booleans are lowercase (#12280)
This patch fixes the indentation in the `encryption` section.
Previously configuration like this:

```yml
cilium_encryption_enabled: true
cilium_encryption_type: wireguard
```

Would template to a `values.yaml` file with indentation that looks like this:

```yml
encryption:
  enabled: True
    type: wireguard
    nodeEncryption: False
```

instead of this:

```yml
encryption:
  enabled: true
  type: wireguard
  nodeEncryption: false
```

This syntax issue causes an error during Cilium installation.

This patch also makes all boolean values in this template file go through the `to_json` filter.
Since values like `True` and `False` are not compliant with the YAML v1.2 spec,
avoiding them is preferable.

`to_json` may be used for all other values in this template to ensure we end up with
a valid YAML document in all cases (even when various strings include special characters),
but this was left for another (future) patch.
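
The effect on booleans can be reproduced in plain Python, on the assumption that `to_json` serializes values with the standard `json` module. The `render_plain` and `render_json` helpers are hypothetical and only mimic templating a single line of the values file.

```python
import json


def render_plain(key, value):
    # Mimics interpolating a Python value directly: booleans become True/False.
    return f"{key}: {value}"


def render_json(key, value):
    # Mimics serializing the value as JSON first: booleans become true/false.
    return f"{key}: {json.dumps(value)}"


print(render_plain("enabled", True))     # enabled: True
print(render_json("enabled", True))      # enabled: true
print(render_json("type", "wireguard"))  # type: "wireguard"
```

Strings come out quoted, which is also valid YAML and is why the same filter could later be applied to the remaining values.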
2025-06-05 05:48:39 -07:00
Max Gautier
5243b33bd7 Cleanup support for removed OS in bootstrap
- centos < 8
- debian 10
2025-06-05 11:16:25 +02:00
Max Gautier
d5b2a9b5ba opensuse: move package installation to system_packages
No reason to special case
2025-06-05 11:16:24 +02:00
Max Gautier
2152022926 debian-based distro: handle apt update cache when installing packages
The package module passes options to the underlying package manager
module if it supports them. No need to handle it in bootstrap.
2025-06-05 11:16:24 +02:00
Max Gautier
f13b80cac0 ClearLinux: remove special casing
- put package install in system_packages
- docker should be handled by the appropriate roles if used as container
  engine
2025-06-05 11:16:23 +02:00
Shuu
a87b86c6d3 Make main_ip cacheable in facts (#12243) 2025-06-05 01:58:38 -07:00
Kubernetes Prow Robot
d287420e8e Merge pull request #11868 from tico88612/test/flatcar-4081
Add Flatcar 4081 CI test
2025-06-05 01:08:43 -07:00
Peter Pan
85b0be144a Fix: check expiry before do breaking renew and container restart actions (#12194)
* Fix: check expiry before renew

Since certificate renewal and container restarts involve higher risks,
they should be executed with extra caution.
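
A minimal sketch of such a guard, written in Python rather than as the Ansible tasks this commit changes. The function name `needs_renewal` and the 30-day threshold are assumptions chosen for illustration; the commit message does not state which threshold is used.

```python
from datetime import datetime, timedelta, timezone


def needs_renewal(not_after, now, threshold=timedelta(days=30)):
    """Return True when the certificate expires within `threshold` of `now`."""
    return not_after - now <= threshold


now = datetime(2025, 6, 5, tzinfo=timezone.utc)

# Expires in 15 days: renew, and accept the restart that follows.
print(needs_renewal(datetime(2025, 6, 20, tzinfo=timezone.utc), now))  # True

# Expires in a year: skip the renewal and leave the containers running.
print(needs_renewal(datetime(2026, 6, 5, tzinfo=timezone.utc), now))   # False
```

Gating the renewal on the expiry date keeps the risky steps from running when the certificates are still far from expiring.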

* squash to Fix: check expiry before renew

* squash to Fix: address more comments from VannTen

Signed-off-by: Peter Pan <Peter.Pan@daocloud.io>

---------

Signed-off-by: Peter Pan <Peter.Pan@daocloud.io>
2025-06-05 01:04:41 -07:00
ChengHao Yang
6f7822d25c [flannel] upgrade to 0.26.7 (#12260)
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-06-05 00:42:53 -07:00
ChengHao Yang
b1fc870750 Add tico88612 as approver (#12281)
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-06-04 22:56:42 -07:00
dependabot[bot]
d0e9088976 build(deps): bump cryptography from 45.0.2 to 45.0.3 (#12259)
Bumps [cryptography](https://github.com/pyca/cryptography) from 45.0.2 to 45.0.3.
- [Changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/pyca/cryptography/compare/45.0.2...45.0.3)

---
updated-dependencies:
- dependency-name: cryptography
  dependency-version: 45.0.3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-06-02 22:50:45 -07:00
Imran Ahmed
ce26f17e9e fix unquoted san cert causing issues with ips (#12256) 2025-06-02 22:50:38 -07:00
Christos Papageorgiou
a9f600ffa2 Import centos bootstrap os task for Alma/Rocky Linux (#12264) 2025-06-02 22:42:38 -07:00
ERIK
3454cd2c69 feat: Support certificate validity period config in kubeadm v1beta4 (#12272)
Signed-off-by: bo.jiang <bo.jiang@daocloud.io>
2025-06-02 20:44:37 -07:00
ChengHao Yang
0d5e18053e Test: remove bin_dir from other tasks move to common_vars.yml
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-05-29 12:43:38 +08:00
Max Gautier
2fbbf2e1e4 CI/kubevirt: Configure ignition provisioning
Flatcar does not support cloud-init
2025-05-27 23:29:56 +08:00
ant31
3597b8d7fe Kubevirt: use Ignition cloud config 2025-05-27 23:29:55 +08:00
ChengHao Yang
68d8f14f0d Update CI.md document
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-05-27 23:29:55 +08:00
ChengHao Yang
32675695d7 Add flatcar 4081 CI packet test
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-05-27 23:29:55 +08:00
Kubernetes Prow Robot
c7c3d2ba95 Merge pull request #12163 from VannTen/cleanup/etcd_inv_sample
Move etcd inventory sample doc to role defaults
2025-05-26 03:16:16 -07:00
Ali Afsharzadeh
c89c34f4d6 Update load balancers versions to Nginx 1.28.0, Haproxy 3.1.7 (#12178) 2025-05-23 20:50:34 -07:00
Max Gautier
92e8ac9de2 Remove tag 'master' (#12228)
* Remove tag master

Following its deprecation in 4b324cb0f (Rename master to control plane
- non-breaking changes only (#11394), 2024-09-06)

* Add fail fast path when using removed tags

- Used for the master tag, but this could be used for other things in
  the future
2025-05-22 01:20:36 -07:00
Anshuman Agarwala
73b3e9b557 Removed weave support (#12230) 2025-05-22 01:10:36 -07:00
Max Gautier
b79f7d79f0 docs: remove obsolete cgroups variables (#12239)
Those variables have been removed since 1bc61c9f3 (Simplify kubelet-config
template, 2023-11-23), so remove them from the docs as well.
2025-05-21 22:40:35 -07:00
Max Gautier
490dece3bf Cleanup assert after 2.28 (#12245)
Users should have used 2.28 and adapted their inventories now.
2025-05-21 20:28:35 -07:00
Takuya Murakami
c1e3f3120c CI: Use ubuntu-2204 for crio test 2025-05-22 08:59:52 +09:00
Takuya Murakami
16c05338d9 Update cri-o to 1.33.0 for kubernetes 1.33
Use ubuntu 22.04 for molecule test of cri-o,
because crun included in the cri-o does not work on
ubuntu 20.04.
2025-05-22 08:43:03 +09:00
Takuya Murakami
8ad1253b4f [kubernetes] Support kubernetes 1.33.1
- Add checksum entries.
- Set min required version to Kubernetes 1.31.x
- Update supported versions
- Refactor coredns_version
2025-05-21 23:56:47 +09:00
Takuya Murakami
cee065920f fix: The 'AppArmor' feature gate is removed from kubernetes 1.33
Signed-off-by: Takuya Murakami <murakami_da@nec.com>
2025-05-21 23:56:47 +09:00
ChengHao Yang
871941f663 Chore: upgrade galaxy.yml version (#12241)
* Chore: upgrade galaxy.yml version

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>

* Docs: upgrade version to v2.28.0

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>

---------

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-05-21 07:46:35 -07:00
Anshuman Agarwala
63cdf87915 Removed equinix provider (#12229) 2025-05-20 03:53:15 -07:00
Max Gautier
175babc4df Move some approvers to emeritus (#12156)
Thanks for your work!
2025-05-20 03:11:17 -07:00
Ekko
6c5c45b328 Allow stopping ubuntu unattended-upgrades (#12174)
Signed-off-by: Ekko Tu <lihai.tu@daocloud.io>
2025-05-20 01:07:16 -07:00
Kubernetes Prow Robot
019cf2ab42 Merge pull request #12101 from tico88612/refactor/cilium-install
Refactor Cilium CNI installation
2025-05-20 01:01:15 -07:00
dependabot[bot]
571e747689 build(deps): bump cryptography from 44.0.3 to 45.0.2 (#12235)
Bumps [cryptography](https://github.com/pyca/cryptography) from 44.0.3 to 45.0.2.
- [Changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/pyca/cryptography/compare/44.0.3...45.0.2)

---
updated-dependencies:
- dependency-name: cryptography
  dependency-version: 45.0.2
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-05-19 07:21:15 -07:00
ChengHao Yang
1266527014 Add cilium cli binary hash before 0.18.3
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-05-19 08:48:15 +08:00
ChengHao Yang
5e2e63ebe3 Make cilium dnsProxy transparent mode configure
When Cilium is configured to replace kube-proxy, it automatically
enables dnsProxy, which can conflict with nodelocaldns.
2025-05-19 08:48:15 +08:00
ChengHao Yang
db290ca686 Add cilium gateway api support
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-05-19 08:48:15 +08:00
ChengHao Yang
6619d98682 Add cilium hubble export dynamic content
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-05-19 08:48:15 +08:00
ChengHao Yang
b771d73fe0 Add cilium hubble export file max backups & size mb
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-05-19 08:48:15 +08:00
ChengHao Yang
65751e8193 Add cilium operator tolerations default values
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-05-19 08:48:15 +08:00
ChengHao Yang
4c16fc155f Cilium values k8sServiceHost and k8sServicePort use auto
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-05-19 08:48:14 +08:00
ChengHao Yang
dcd3461bce Cilium values use image variables
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-05-19 08:48:14 +08:00
ChengHao Yang
48f75c2c2b Upgrade Cilium related images
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-05-19 08:48:14 +08:00
ChengHao Yang
a4b73c09a7 Upgrade cilium version to 1.17.3
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-05-19 08:48:14 +08:00
ChengHao Yang
af62570110 Change cilium_kube_proxy_replacement to true for CI tests
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-05-19 08:48:14 +08:00
ChengHao Yang
bebba47eb4 Change kube_owner to root for cilium CI test
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-05-19 08:48:14 +08:00
ChengHao Yang
86437730de Use cilium-cli install Cilium
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-05-19 08:48:14 +08:00
ChengHao Yang
6fe64323db Remove old cilium templates install
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-05-19 08:48:13 +08:00
ChengHao Yang
1e471d5eeb Upgrade outdated cilium_min_version_required
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-05-19 08:48:11 +08:00
Max Gautier
3a2862ea19 Move checksums to kubespray_defaults/vars (#12234)
The checksums are not defaults and are not meant to be changed from
the inventories.

Furthermore, role defaults have a lower priority than host facts, which
technically means a rogue host could hijack the hashes for its
variables.
2025-05-18 16:13:14 -07:00
Jay.H
8a4f4d13f7 fix manage-offline-container-images.sh create_registry (#11964) 2025-05-17 07:25:13 -07:00
ErmolenkoMaxim
46a0dc9a51 Add support for hubble-export-file-max-backups and max-size-mb variables (#12072)
* feat(cilium): add configurable Hubble export log rotation parameters

- Adds support for `cilium_hubble_export_file_max_backups` and `cilium_hubble_export_file_max_size_mb`
- Applies values only if `cilium_hubble_export_file_path` is defined
- Default values are set in role defaults
- Cleans up template logic by removing unnecessary conditionals

* Fix indentation for hubble export settings

* Fix undefined variable issue with ipwrap in kubeconfig override that caused pre-commit errors

* Update main.yml

rollback
2025-05-17 00:35:13 -07:00
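For the Hubble export rotation settings added in 46a0dc9a51 above, an inventory override might look like the fragment below. The three variable names come from the commit message; the values are illustrative assumptions only, and per the commit the rotation settings apply only when `cilium_hubble_export_file_path` is defined.

```yaml
# Inventory group_vars fragment (sketch). Values are examples, not project defaults.
cilium_hubble_export_file_path: /var/run/cilium/hubble/events.log
cilium_hubble_export_file_max_backups: 5   # rotated files to keep
cilium_hubble_export_file_max_size_mb: 10  # rotate once the file reaches this size
```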
Max Gautier
faae36086c Patch versions updates (#12226)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-05-16 14:13:14 -07:00
Max Gautier
9c2bdeec63 Decouple etcd defaults in a separate role
This allows us to reuse the defaults in other places without putting
everything in kubespray-defaults.

In particular, for kubernetes/control-plane.
2025-05-16 14:51:29 +02:00
ERIK
e4c0c427a3 improve NTP package conflict handling (#12212)
Signed-off-by: bo.jiang <bo.jiang@daocloud.io>
2025-05-16 03:55:14 -07:00
Max Gautier
bca5a4ce3b CI: remove ci-not-authorized job (#12225)
This is now handled directly at the failfast-ci level (== integration
Github <-> Gitlab).
The whole pipeline will not be triggered unless:
- The author is a maintainer
- The PR has the /ok-to-test label
2025-05-16 03:27:13 -07:00
Antoine Legrand
5c07c6e6d3 Add option to [not] install coredns via Kubespray (#12218) 2025-05-16 03:23:13 -07:00
Takuya Murakami
c6dfe22a41 Improve logging of kubeadm init failure of first control plane node (#12216)
Split retry task of 'kubeadm init' to show the failure log of
the first execution.
2025-05-16 03:01:13 -07:00
Seena Fallah
ec85b7e2c9 download: respect enable_dns_autoscaler when enabling dnsautoscaler (#12217)
dnsautoscaler should only be enabled when enable_dns_autoscaler is
set to true. without this, it could be enabled without any manifest
actually using it, which makes it a false signal.

Signed-off-by: Seena Fallah <seenafallah@gmail.com>
2025-05-15 12:45:13 -07:00
Kubernetes Prow Robot
acd6872c80 Merge pull request #12219 from VannTen/test/ha_etcd_separate
Fix broken workaround for separate etcd setup
2025-05-15 12:39:14 -07:00
Max Gautier
22d3cf9c2b Move 'pretend certificates' **after** cert distribution
The link target will only exist after we distribute the certs on each node.
2025-05-15 18:35:34 +02:00
Max Gautier
2d3bd8686f Add testcase separate ha-etcd
Also use a distinct node to test certificate distribution.
2025-05-15 18:20:13 +02:00
Hyeonki Hong
2c3b6c9199 feat: add trigger to restart kube-apiserver when config files change (#12172)
* feat: add trigger to restart kube-apiserver when config files change

* fix: remove not upgrade_cluster_setup condition

* refactor: streamline kube-apiserver restart notifications
2025-05-15 06:51:14 -07:00
Max Gautier
a55932e1de Patch versions updates (#12204)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-05-14 18:55:20 -07:00
Max Gautier
973bd2e520 Stop cleaning up containerd packages (#12213)
The switch to not use system packages for containerd packages happened
multiple releases ago; there should not be any up-to-date installation
of kubespray needing that cleanup.

Remove those steps and variables only used by them.
2025-05-13 21:07:16 -07:00
Kubernetes Prow Robot
ea7331f5fc Merge pull request #12211 from VannTen/cleanup/rename_remove_node
rename-without-hyphens: remove-node/pre-remove
2025-05-13 17:13:16 -07:00
Kubernetes Prow Robot
df241800ce Merge pull request #12203 from VannTen/cleanup/rename_bootstrap_os
Rename bootstrap-os to bootstrap_os
2025-05-13 05:03:16 -07:00
Cyclinder
8cc5694580 calico: update calico-kube-controller manifest (#12169) 2025-05-13 01:43:17 -07:00
Max Gautier
1d15baf405 Add compat and deprecation warning for bootstrap-os 2025-05-13 09:39:59 +02:00
Max Gautier
47508d5c6e Rename bootstrap-os to bootstrap_os
Role names in ansible collections should not have hyphens.
2025-05-13 09:39:54 +02:00
Max Gautier
2a1ae14275 Compat layer remove-node/pre-remove 2025-05-12 22:22:20 +02:00
Max Gautier
e361def9cd Rename remove-node/pre-remove (no hyphens for role in collection) 2025-05-12 22:19:50 +02:00
Max Gautier
fa6888df4c kubernetes_audit: Remove redundant defaults filter (#12208) 2025-05-12 07:23:14 -07:00
Max Gautier
373b952a0c Cleanup CI scripts (#12205)
* Delete unused scripts

- gen_tags.sh: not the right file, produces garbage even if path is fixed
- premoderator.sh: not used since ef6d24a49 (CI require a 'lgtm' or
  'ok-to-test' labels to pass (#11251), 2024-05-31)
- gitlab-branch-cleanup: unused AFAICT

* CI: inline molecule logs

Single use site -> less indirection makes it easier to read.
2025-05-12 05:53:15 -07:00
felipe88alves
9bbd597e20 create cilium_operator_tolerations variable in group_var (#12200)
- This enables the override of the tolerations for the cilium-operator deployment
- The default behaviour is to leave the tolerations as is unless the var is set
2025-05-12 03:25:15 -07:00
Cheolhui Kim
fceb1516b8 Update: add Cilium LB IP Pool configuration to support ranges (#12140) 2025-05-12 01:39:18 -07:00
Kubernetes Prow Robot
43e19ab281 Merge pull request #12202 from VannTen/cleanup/rename_kubespray_defaults
Rename kubespray-defaults to kubespray_defaults
2025-05-12 01:21:14 -07:00
Max Gautier
4052cd5237 Add compat and deprecation warning for kubespray-defaults 2025-05-12 09:46:07 +02:00
Kim Hyunyoung, Abel
e1be469995 fix: do not mount hubble-ui tls volume when cilium_hubble_tls_generate is false (#12143) 2025-05-11 20:27:14 -07:00
Max Gautier
23d8c9a820 CI: enabled all jobs on daily CI (#12207) 2025-05-11 19:51:14 -07:00
Max Gautier
e618421697 Don't run upgrade-patch jobs on forks (#12206)
With the current github-workflow setup, workflows are triggered on every
forked repository (which is quite wasteful).

Add a condition to only run on the main repository.
2025-05-10 06:15:14 -07:00
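The fork guard described in e618421697 above is usually expressed as a job-level `if:` on the repository name. The fragment below is a sketch under assumptions: the workflow trigger, job name and step are hypothetical, and only the `github.repository` comparison illustrates the condition the commit describes.

```yaml
# GitHub Actions workflow fragment (sketch; trigger, job name and step are hypothetical).
on:
  schedule:
    - cron: "0 3 * * *"
jobs:
  upgrade-patch-versions:
    # Skip the job entirely on forks; only the main repository runs it.
    if: github.repository == 'kubernetes-sigs/kubespray'
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
```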
Max Gautier
7db2aa1cba Rename kubespray-defaults to kubespray_defaults
Role names in ansible collections should not contain hyphens.
2025-05-10 10:04:37 +02:00
Kubernetes Prow Robot
0c8dfb8e43 Merge pull request #12185 from VannTen/cleanup/iproute_with_the_rest
Move package installation to bootstrap-os
2025-05-09 20:49:14 -07:00
Max Gautier
25e4fa17a8 Split kubespray-defaults (-> network_facts)
kubespray-defaults currently does two things:
- records a number of default variable values (in particular values used
  in several places)
- gather and compose some complex network facts (in particular,
  `fallback_ip` and `no_proxy`)

There is no actual reason to couple those two things, and it makes using
defaults more difficult (because computing the network facts is somewhat
expensive, we don't want to do it willy-nilly)

Split the two and adjust import paths as needed.
2025-05-09 21:14:26 +02:00
Max Gautier
bb4b2af02e Drop install of python-libselinux for RHEL family below 8
Support for RHEL 7 and derivatives was removed some time ago; clean up
the leftovers.
2025-05-09 21:14:25 +02:00
ChengHao Yang
27e93ee9f6 Feat: Gateway API early installation (#12189)
The Gateway API needs to be installed first if you want to use Cilium's
Gateway API functionality. The Gateway API is just CRD without any Pod,
Deployment, etc., so I think it can be brought forward to before the CNI
installation.

Signed-off-by: ChengHao Yang
2025-05-09 09:47:14 -07:00
dependabot[bot]
65bcddb9fd build(deps): bump cryptography from 44.0.2 to 44.0.3 (#12190)
Bumps [cryptography](https://github.com/pyca/cryptography) from 44.0.2 to 44.0.3.
- [Changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/pyca/cryptography/compare/44.0.2...44.0.3)

---
updated-dependencies:
- dependency-name: cryptography
  dependency-version: 44.0.3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-05-09 01:53:14 -07:00
Chad Swenson
76707073c4 Fix indentation on AuthorizationConfiguration task (#12197) 2025-05-09 00:05:19 -07:00
Bas
a104fb6a00 kubedns_version no longer used (#12201)
This variable is documented, but not found in the rest of the sources.
2025-05-09 00:01:14 -07:00
ERIK
1c4b18b089 fix: arm64 checksums for youki and kata-containers (#12173)
Signed-off-by: bo.jiang <bo.jiang@daocloud.io>
2025-05-08 19:05:14 -07:00
Max Gautier
d6d87e9a83 Move cilium_deploy_additionnaly to kubespray-default (#12191)
Instead of using default(false) all over the place, use
kubespray-defaults
2025-05-07 05:05:17 -07:00
Max Gautier
985e4ebb23 Remove versions from inventory sample (#12164)
The recommended usage of kubespray is to use the default versions.
So putting them in inventory/sample is not really very helpful, and
causes:
- churn (keeping the inventory/sample up to date)
- support issues (mismatch between defaults and sample inventory)

Remove all concrete versions from the inventory sample.
2025-05-06 08:43:14 -07:00
Max Gautier
fcc294600c Workaround missing etcd certs on control plane node (#12181) 2025-05-05 01:05:57 -07:00
Max Gautier
9631b5fd44 Move etcd inventory sample doc to role defaults 2025-05-04 21:24:26 +02:00
Max Gautier
a7d681abff Install iputils with other packages 2025-05-04 21:22:49 +02:00
Max Gautier
5867fa1b9f Move back iproute install to system_packages
Packages are now installed before network facts collection, so we can
install iproute with the rest.
2025-05-04 21:22:49 +02:00
Max Gautier
1e79c7b3cb Move package install to bootstrap-os 2025-05-04 21:22:48 +02:00
Max Gautier
34d64d4d04 Remove outdated comment
bootstrap-os does not do anything in sudoers since e2ad6aad5 (bootstrap:
rework role (#4045), 2019-02-11).

So SSH pipelining working is effectively a pre-requisite anyway.
2025-05-04 21:22:48 +02:00
Max Gautier
87726faab4 Move check 'sorted pkgs list' to pre-commit
This is a lint check, which should not live in the playbook itself.
2025-05-04 21:22:47 +02:00
Max Gautier
1b9919547a Split 'offline' assert into their own role
The preinstall asserts cover a number of things, many of which depend
only on the inventory, and can be run without any ansible_facts
collected.

Split them off to simplify re-ordering.
2025-05-04 21:22:46 +02:00
Kubernetes Prow Robot
84d96d5195 Merge pull request #12165 from tico88612/fix/failing-test-coredns-autoscaler
Feat: add `dns_autoscaler_affinity` and remove in-place values
2025-05-03 13:17:55 -07:00
ChengHao Yang
1374a97787 Test: ubuntu22-calico-all-in-one-upgrade disable dns autoscaler
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-05-04 00:23:05 +08:00
bin.pan
6f0fc020e8 update containerd.options key name (#12170) 2025-05-02 23:27:55 -07:00
Takuya Murakami
f58a6e2057 docs: Fix offline-environment.md to add 'v' prefix of some versions (#12166)
* docs: Fix offline-environment.md to add 'v' prefix of some versions

Now some version variables (kube_version, etcd_version, etc) don't have 'v' prefix,
so you need to add 'v' prefix to download URLs.

* fix: Fix offline.yml to add 'v' prefix of some versions
2025-05-02 01:57:55 -07:00
Ali Afsharzadeh
09fad4886a Fix path to facts.yml in node facts refresh section (#12177) 2025-05-02 00:39:56 -07:00
Ho Kim
c47711c2f2 fix: correct indent of cpuManagerPolicyOptions (#12123) 2025-05-02 00:27:56 -07:00
Karthik S
a3e6e66204 Etcd Certificates are not generated when adding nodes to an existing cluster with scale.yml (#12120)
* [Issue-12117]-Certificates for the new hosts are not generated during scale.yml

* [Issue-12117]-Certificates for the new hosts are not generated during scale.yml

* [Issue-12117]-Certificates for the new hosts are not generated during scale.yml
2025-05-02 00:03:56 -07:00
ChengHao Yang
2907936c85 Feat: add dns_autoscaler_affinity remove in-place values
Upstream has removed affinity; this also fixes the failing upgrade test.

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-04-28 19:18:19 +08:00
ChengHao Yang
71a323039f Fix: kubelet-csr-approver moves to regular application installation (#12141)
This commit fixed the process to ensure that CCM is installed first to
avoid the chicken-and-egg problem.

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-04-28 01:27:26 -07:00
ChengHao Yang
5e5e509698 Revert "Update cluster-proportional-autoscaler to v1.9.0 (#11982)" (#12168)
This reverts commit 16841a1fb0.
2025-04-28 01:23:32 -07:00
Takuya Murakami
4a598c1ef3 Make kubernetes 1.32.4 default (#12161) 2025-04-25 01:22:30 -07:00
Aviral Agarwal
1da9f0dec4 Fixed kube-vip to use kube-vip/kube-vip-iptables image instead of kube-vip/kube-vip when lb_fwdmethod or kube_vip_lb_fwdmethod is set to masquerade (#12145) 2025-04-24 15:54:30 -07:00
ShinyaIshitobi
629a690886 fix: Enable NRI for containerd and disable plugin when nri_enabled is false (#12152)
* fix(containerd): always render NRI plugin block with conditional disable flag

* feat: enable Node Resource Interface plugin when using containerd

* fix: remove the

* fix: fix for linter
2025-04-24 01:40:33 -07:00
Mathieu Parent
16841a1fb0 Update cluster-proportional-autoscaler to v1.9.0 (#11982) 2025-04-24 01:32:37 -07:00
ERIK
22c19a40fa feat: Update containerd and nerdctl checksums to latest versions (#12154)
Signed-off-by: bo.jiang <bo.jiang@daocloud.io>
2025-04-24 01:02:31 -07:00
ERIK
8f41a2886d Update version comparison syntax and optimize whitespace (#12146)
Signed-off-by: bo.jiang <bo.jiang@daocloud.io>
2025-04-24 00:56:31 -07:00
Max Gautier
38cea5b866 Patch versions updates (#12119)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-04-23 21:48:30 -07:00
Ekko
4177289ef6 Fix typo in .gitlab-ci/kubevirt.yml (#12134)
Signed-off-by: Ekko Tu <lihai.tu@daocloud.io>
2025-04-18 03:59:06 -07:00
Kubernetes Prow Robot
4ad9f9b535 Merge pull request #11763 from tico88612/feat/gateway-api-v1.2.1
Refactor Gateway API installation process and bump Gateway API v1.2.1
2025-04-11 08:38:42 -07:00
ChengHao Yang
6f58b33de0 Deprecate gateway_api_experimental_channel
Please use `gateway_api_channel` and set `experimental`.

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-04-11 23:04:01 +08:00
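Following the deprecation in 6f58b33de0 above, an inventory that previously relied on `gateway_api_experimental_channel` would switch to the new variable. A minimal sketch; both variable names and the `experimental` value come from the commit message, and nothing else about the inventory is assumed.

```yaml
# Inventory fragment (sketch).
# Deprecated: gateway_api_experimental_channel
gateway_api_channel: experimental
```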
ChengHao Yang
9456e792f1 Remove unused Gateway API template
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-04-11 22:57:00 +08:00
ChengHao Yang
7f60dda565 Refactor Gateway API manifests installation process
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-04-11 22:57:00 +08:00
ChengHao Yang
582fe2cbde Add Gateway API download information in kubespray-default
Remove old variables in kubernetes-apps/gateway_api

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-04-11 22:57:00 +08:00
Max Gautier
79fbfdf271 component_hash_update: support calico_crds (#12122)
- add support for "no_arch" downloads: arch-independent files such as
  YAML manifests, helm charts, etc.
- wire calico_crds with it.
2025-04-10 02:18:47 -07:00
ChengHao Yang
cfaf397d4a Bump: OpenStack Cloud Controller Manager upgrade to v1.32.0 (#12121)
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-04-10 01:44:41 -07:00
Kubernetes Prow Robot
2f404de77c Merge pull request #12037 from VannTen/ci/convert_vagrant_to_kubevirt_2
CI: convert remaining vagrant jobs (except IPv6) to kubevirt + cleanups
2025-04-09 01:16:42 -07:00
Mohammd Reza Mollasalehi
d304966d75 doc: fix a broken link in the Calico documentation (#12108) (#12109) 2025-04-08 06:32:46 -07:00
ChengHao Yang
4ce5510c1a [rbd-provisioner] deprecate outdated application and documentation (#12114)
* Cleanup: deprecate rbd-provisioner application

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>

* Docs: remove rbd-provisioner application

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>

---------

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-04-08 06:22:44 -07:00
ChengHao Yang
8032b8281d [cephfs-provisioner] deprecate outdated application and documentation (#12113)
* Cleanup: deprecated CephFS application

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>

* Docs: Remove CephFS Application

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>

---------

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-04-08 03:08:39 -07:00
Farshad Asadpour
45ecceb3e1 docs(terraform): update command for destroying infrastructure in README (#12111) 2025-04-08 02:16:39 -07:00
Max Gautier
5a6ef1dafa Timeout on RHEL subscription check (#12115)
subscription-manager status can in some circumstances just never
terminate, with nothing indicating the problem from the Ansible
playbook log.
This makes it difficult to find the hosts misbehaving.

Add a timeout to the subscription checks (defaulting to 3 minutes). This
should be more than enough for normal circumstances while allowing
easier troubleshooting, as the hosts will be FAILED instead of the
playbook just waiting indefinitely.
2025-04-08 01:24:44 -07:00
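The timeout described in 5a6ef1dafa above maps naturally onto Ansible's task-level `timeout` keyword, which fails the task once the limit (in seconds) is exceeded. The task below is a sketch, not the commit's code: the variable `rhel_subscription_check_timeout` and the register name are hypothetical, and the 180-second default mirrors the "3 minutes" in the message.

```yaml
# Sketch only: variable and register names are assumptions.
- name: Check RHEL subscription status
  ansible.builtin.command: subscription-manager status
  register: rhel_subscription_status
  changed_when: false
  # Fail this host instead of letting the play hang indefinitely.
  timeout: "{{ rhel_subscription_check_timeout | default(180) }}"
```

With a limit in place, a misbehaving host shows up as FAILED in the play recap rather than stalling the run with no output.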
Max Gautier
0ae9ab36ce CI: Pin github actions for security (#12105)
Dependabot can still upgrade the action version.
2025-04-03 06:22:38 -07:00
Bas
cf48915657 Documenting offline installation with secure files repo and registry. (#11993)
* Add config for addon helm and local_path_provisioner

* Documenting offline installation with secure files_repo

* Documenting offline installation with secure registry
2025-04-03 02:06:37 -07:00
Fredrik Liv
6f74ef17f7 Upcloud: Add possibility to setup cluster using nodes with no public IPs (#11696)
* terraform upcloud: Added possibility to set up nodes with only private IPs

* terraform upcloud: add support for gateway in private zone

* terraform upcloud: split LB proxy protocol config per backend

* terraform upcloud: fix flexible plans

* terraform upcloud: Removed overview of cluster setup

---------

Co-authored-by: davidumea <david.andersson@elastisys.com>
2025-04-01 07:58:42 -07:00
Max Gautier
fe2ab898b8 component_hash_update: remove obsolete todos (#12098) 2025-03-31 15:18:35 -07:00
dependabot[bot]
c8b8567781 build(deps): bump actions/checkout from 3 to 4 (#12089)
Bumps [actions/checkout](https://github.com/actions/checkout) from 3 to 4.
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](https://github.com/actions/checkout/compare/v3...v4)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-03-31 01:40:34 -07:00
dependabot[bot]
bf86c14d35 build(deps): bump redhat-plumbers-in-action/advanced-issue-labeler (#12090)
Bumps [redhat-plumbers-in-action/advanced-issue-labeler](https://github.com/redhat-plumbers-in-action/advanced-issue-labeler) from 2 to 3.
- [Release notes](https://github.com/redhat-plumbers-in-action/advanced-issue-labeler/releases)
- [Commits](https://github.com/redhat-plumbers-in-action/advanced-issue-labeler/compare/v2...v3)

---
updated-dependencies:
- dependency-name: redhat-plumbers-in-action/advanced-issue-labeler
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-03-31 01:14:35 -07:00
dependabot[bot]
e47eb4bc7f build(deps): bump pytest-testinfra from 10.1.1 to 10.2.2 (#12096)
Bumps [pytest-testinfra](https://github.com/pytest-dev/pytest-testinfra) from 10.1.1 to 10.2.2.
- [Release notes](https://github.com/pytest-dev/pytest-testinfra/releases)
- [Changelog](https://github.com/pytest-dev/pytest-testinfra/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/pytest-dev/pytest-testinfra/compare/10.1.1...10.2.2)

---
updated-dependencies:
- dependency-name: pytest-testinfra
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-03-31 01:10:35 -07:00
Max Gautier
5222f48978 auto-update: use a branch prefix rather than suffix (#12097)
This is more in-line with dependabot and similar auto-updaters.

Reduce ci coverage on github action updating (it does not change
kubespray code, no need for testing).
2025-03-31 01:04:36 -07:00
Max Gautier
7b6b7318b2 Remove unused manifest (docs) (#12092)
This file is no longer referenced since e0d67367e (Update installation
doc with vagrant (#8406), 2022-01-11).
2025-03-29 11:26:34 -07:00
Kubernetes Prow Robot
f02d313fee Merge pull request #12093 from VannTen/cleanup/contrib
Cleanup old things in contrib/
2025-03-29 10:16:34 -07:00
Max Gautier
7c9870d15b Remove contrib/mitogen
- the playbook does not work
- the mitogen version is not up to date

This strongly suggests this is not used; let's drop it.
2025-03-28 09:49:28 +01:00
Max Gautier
c8ea1468d1 Remove unmaintained contrib: kvm-setup 2025-03-28 09:39:30 +01:00
Max Gautier
ad26fe77f5 [contrib] Remove heketi and glusterfs (#12091)
* Remove heketi

Heketi is no longer developed or supported and should not be used
anymore.
Remove the contrib playbook.

* Remove contrib glusterfs

Integration with glusterfs is now either deprecated or
unsupported.

Other storage solutions should be preferred.
2025-03-27 15:32:38 -07:00
Max Gautier
aae53a9df5 Add github workflow to auto update hashes daily (#12053) 2025-03-27 14:10:34 -07:00
Farshad Asadpour
1513254622 fix(remove-node): Ensure safety and validation for node removal process (#12085)
This commit enhances the node removal playbook's reliability and safety by implementing the following changes:

1. **Node Validation**: Added a validation step using assert to ensure the `node` variable is defined and contains nodes. If the list is empty or undefined, the playbook fails early, preventing accidental operations on the entire cluster.

2. **Removed Defaulting for Hosts**: Updated tasks to enforce explicit `node` variable input without defaulting to critical groups (e.g., `etcd:k8s_cluster:calico_rr`). By validating `node` beforehand, tasks now solely rely on user-provided input and safely avoid unintended targeting.

3. **Explicit User Confirmation**: Enhanced the confirmation prompt to clarify the scope of the operation. The admin is now required to explicitly confirm node state deletion, ensuring a deliberate decision before proceeding.

These improvements strengthen the reliability and safety of the `remove-node.yml` playbook by eliminating ambiguous behavior, preventing misconfigurations, and ensuring clear interaction during node removal tasks.
2025-03-27 06:10:34 -07:00
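The node validation step from 1513254622 above (point 1) can be sketched with `ansible.builtin.assert`. This is an illustration under assumptions rather than the playbook's actual tasks: the play layout and the failure message are hypothetical; only the `node` variable and the use of assert come from the commit message.

```yaml
# Sketch only: play layout and message wording are assumptions.
- name: Validate the nodes to remove
  hosts: localhost
  gather_facts: false
  tasks:
    - name: Fail early unless `node` is defined and non-empty
      ansible.builtin.assert:
        that:
          - node is defined
          - node | length > 0
        fail_msg: "Set the nodes to remove explicitly, e.g. -e node=<name>[,<name>]"
```

Because the check runs before any play that targets `node`, an empty or missing value stops the run instead of falling back to a broad host pattern.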
Ricky Kwan
4a5b524b98 Ensure metrics port exists for nodelocaldns/nodelocaldns-second daemonsets (#11998)
- update metrics port to use port variable
- unconditionally define ports
2025-03-27 04:14:34 -07:00
Max Gautier
aa0c0851f8 Upgrade kube-router (#12066)
- This happens to fix the fact that kube-router is broken when using the
  service proxy: https://github.com/cloudnativelabs/kube-router/issues/1558
2025-03-26 17:04:38 -07:00
Max Gautier
0fc56ed344 CI: fix terraform
- add default testcase
- fix ansible ssh connection
2025-03-26 20:05:26 +01:00
Max Gautier
5c4e597987 CI: workaround build: disable rebase 2025-03-26 20:05:25 +01:00
Max Gautier
ef133fd93d CI: cleanups leftovers things
include_vars is redundant as the file is already included by extra_vars
2025-03-26 20:05:25 +01:00
Max Gautier
f6ca3bf477 CI: simplify image build job 2025-03-26 20:05:24 +01:00
Max Gautier
b9e251ac7a CI: cleanup terraform + deduplicate and simplify 2025-03-26 20:05:23 +01:00
Max Gautier
43fceebdd3 CI: convert vagrant jobs to kubevirt
Vagrant jobs need a big cache which makes them slow / sometimes stuck
completely. Using the kubevirt provisioning playbook is now
significantly faster, so do just that.

Having only one provisioner in CI will also allow us to remove some of
the custom runners executors we use for vagrant, and more generally
reduce the CI maintenance.

Our kubevirt CI platform does not support IPv6 yet, so we keep the
relevant jobs in vagrant, but we'll migrate them as well as soon as
possible.
2025-03-26 20:05:21 +01:00
Max Gautier
862aec4dc6 CI: remove 'packet' from jobs name + rename to kubevirt
This is more accurate, the name 'packet' being an artifact of history
(the Kubevirt jobs used to run on Packet, the previous name of Equinix)
2025-03-26 14:32:26 +01:00
Max Gautier
4f3b214ef5 CI: streamline packet jobs definition
- Take advantage of `parallel:matrix` to make the jobs definition shorter
  and more readable.
- Remove helper scripts which are no longer needed
- Remove redundant indirection in the gitlab-ci pipelines definitions
  (only one user)
2025-03-26 14:32:24 +01:00
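The `parallel:matrix` approach mentioned in 4f3b214ef5 above lets one GitLab CI job definition fan out into one job per test case. The fragment below is a sketch: the job name, stage, script path and test case names are all hypothetical and are not taken from the repository's pipeline.

```yaml
# .gitlab-ci.yml fragment (sketch; all names are hypothetical).
kubevirt:
  stage: deploy
  script:
    - ./tests/run-testcase.sh "$TESTCASE"
  parallel:
    matrix:
      # One job is generated per value of TESTCASE.
      - TESTCASE:
          - example-distro-calico
          - example-distro-cilium
```

Adding a test case then means adding one list entry instead of a new job block.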
ERIK
a4843eaf5e fix: missing 'v' prefix in offline image tags (#12086)
Signed-off-by: bo.jiang <bo.jiang@daocloud.io>
2025-03-26 06:26:34 -07:00
Kay Yan
0f9f9fb569 support kube-proxy nftables (#12060)
Signed-off-by: Kay Yan <kay.yan@daocloud.io>
2025-03-26 01:32:33 -07:00
Farshad Asadpour
e7c70d6169 fix(ingress-nginx): Upgrade ingress-nginx to v1.12.1 and webhook certgen image to v1.5.2 (#12075)
This commit upgrades ingress-nginx to version v1.12.1, addressing multiple critical vulnerabilities including CVE-2025-1974, CVE-2025-1097, CVE-2025-1098, CVE-2025-24513, and CVE-2025-24514 as detailed in the ingress-nginx release notes: https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v1.12.1

Important Notes:
- Fixing CVE-2025-1974 required disabling validation of the generated NGINX configuration during validation of Ingress resources. Invalid Ingress resources may stop the NGINX configuration from being updated.
- Recommended mitigations include enabling annotation validation and disabling snippet annotations.

Alongside this upgrade, the `ingress_nginx_kube_webhook_certgen_image_tag` has been updated to v1.5.2 for compatibility, based on: https://github.com/kubernetes/ingress-nginx/pull/13066

Changelog:
- Updated ingress-nginx version to v1.12.1 in Kubespray.
- Updated `ingress_nginx_kube_webhook_certgen_image_tag` in `roles/kubespray-defaults/defaults/main/download.yml` to v1.5.2.

Fixes: https://github.com/kubernetes-sigs/kubespray/issues/12073
2025-03-25 09:10:38 -07:00
Max Gautier
cd9c21b7d6 CI: fix wrong git config env variables (#12079) 2025-03-25 05:48:39 -07:00
ChengHao Yang
36cd894d58 Bump Docker default version to 28.0 (#12070)
* Cleanup: unsupport docker version

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>

* Cleanup: unsupport OS rhel7

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>

* Feat: upgrade docker package

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>

* Docs: update rhel docker link

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>

* Feat: upgrade docker version

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>

* Docs: update docker version

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>

---------

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-03-25 04:38:32 -07:00
Kay Yan
b38312d007 Upgrade CI for openSuse 15.6 (#12074)
Signed-off-by: Kay Yan <kay.yan@daocloud.io>
2025-03-25 03:52:33 -07:00
Ekko
f3682d85d3 Update calico crd archive checksum (#12077)
Signed-off-by: Ekko <lihai.tu@daocloud.io>
2025-03-25 03:10:32 -07:00
Takuya Ono
3ace8963b1 docs: fix path in vagrant.md example (#12067) 2025-03-24 08:18:34 -07:00
Chad Swenson
b551fe083d Refactor control plane upgrades with reconfiguration support (#12015)
* Refactor control plane upgrades with reconfiguration support

Adds revised support for:
- The previously removed `--config` argument for `kubeadm upgrade apply`
- Changes to `ClusterConfiguration` as part of the `upgrade-cluster.yml` playbook lifecycle
- kubeadm-config `v1beta4` `UpgradeConfiguration` for the `kubeadm upgrade apply` command: [UpgradeConfiguration v1beta4](https://kubernetes.io/docs/reference/config-api/kubeadm-config.v1beta4/#kubeadm-k8s-io-v1beta4-UpgradeConfiguration).

* Add kubeadm upgrade node support

Per discussion:
- Use `kubeadm upgrade node` on secondary control plane upgrades
- Add support for UpgradeConfiguration.node in kubeadm-config.v1beta4
- Remove redundant `allowRCUpgrades` config
- Revert from `block` for first and secondary control plane back to unblocked tasks since they no longer share much code and it's more readable this way

* Add kubelet and kube-proxy reconfiguration to upgrades

* Fix task to use `kubeadm init phase etcd local`

* Rebase with changes from "Adapt checksums and versions to new hashes updater" PR

* Add `imagePullPolicy` and `imagePullSerial` to kubeadm-config v1beta4 `InitConfiguration.nodeRegistration`
2025-03-21 10:28:31 -07:00
diguage
ae583e2a28 registry support Mac (#11960) 2025-03-21 09:04:39 -07:00
lucianvlad
247b45bca6 fix issue with floating-ip (#11950) 2025-03-21 09:04:32 -07:00
Mathieu Parent
c20388dbbb fix: ensure CoreDNS is kept disabled on kubeadm upgrade (#12028)
See https://github.com/kubernetes/kubernetes/pull/129429
in kubernetes 1.32.1
2025-03-21 09:00:31 -07:00
Chad Swenson
d5a5e6a93c Ensure correct AuthorizationConfiguration API version during upgrades (#12058)
* Ensure correct `AuthorizationConfiguration` API version during upgrades

Fixes an issue where the wrong AuthorizationConfiguration API version could be used by kube-apiserver prematurely during upgrades.

The `kubernetes/control-plane` role writes configuration for the target version before control plane pods are upgraded.

However, since the `AuthorizationConfiguration` file is reconciled continuously, this leads to a race condition where a new configuration version can be reconciled before kube-apiserver is upgraded to the compatible version.

This solution ensures the correct configuration is available throughout the process by writing each api version to a different file path. Unused file versions are cleaned up post-upgrade for better hygiene.

* Avoid from_json in cleanup task
2025-03-21 08:48:31 -07:00
Farshad Asadpour
bab6a9bf64 [argocd] Bump ArgoCD version to 2.14.5 (#12041)
According to the official ArgoCD documentation, version 2.14.5 is required
for compatibility with Kubernetes version 1.31

Reference: https://argo-cd.readthedocs.io/en/stable/operator-manual/installation/#tested-versions

docs(argocd): update version references in documentation and configuration
2025-03-21 05:44:31 -07:00
dependabot[bot]
f70ace5300 build(deps): bump cryptography from 44.0.1 to 44.0.2 (#12004)
Bumps [cryptography](https://github.com/pyca/cryptography) from 44.0.1 to 44.0.2.
- [Changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/pyca/cryptography/compare/44.0.1...44.0.2)

---
updated-dependencies:
- dependency-name: cryptography
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-03-21 01:30:33 -07:00
Max Gautier
c00fc9f221 Check separately for 'v'-prefixed versions for ones which are derived (#12057)
The versions which are by default derived from `kube_version` can break
the assert if kube_version starts with `v`, because they use the start of
`kube_version` as dict key.
By putting them in their own assert, the first assert should trigger on
`kube_version`, with a more explicit error.
2025-03-20 02:36:31 -07:00
Aman Shrivastava
5abaf8cdf4 added dropdown in OS field of the bug-report issue form and github workflow to auto label (#11880) 2025-03-19 01:36:30 -07:00
Kubernetes Prow Robot
02438442b9 Merge pull request #12050 from VannTen/cleanup/next_version_auto
Auto compute previous tag in CI
2025-03-18 03:09:50 -07:00
Max Gautier
03b40e71a3 Cleanup unneeded vagrant python packages from test deps (#12051) 2025-03-18 02:19:50 -07:00
Max Gautier
e0920b33d7 vagrant: allow to config vagrant with separate etcd (#12045)
This should help test separate etcd configuration locally more easily.
2025-03-18 02:15:49 -07:00
Ekko
56d37da105 Bump kube patch versions (#12052)
Signed-off-by: ekko <lihai.tu@daocloud.io>
2025-03-18 02:05:49 -07:00
Max Gautier
fffc1b4ac0 CI: remove KUBESPRAY_VERSION 2025-03-17 18:13:11 +01:00
Max Gautier
c79b3ce46b CI: convert galaxy version check to pre-commit + autodetect 2025-03-17 18:13:10 +01:00
Max Gautier
0c59cc84dd CI: simplify rebase.sh
With the base ref (aka: target branch) available, we don't need to guess
stuff from the branch name
2025-03-17 18:13:09 +01:00
Max Gautier
16b090c5ff CI: cleanup leftover files (#12040) 2025-03-16 20:05:47 -07:00
Not Darko
9e861cf816 [calico] fix: kubecontrollersconfigurations list permission (#12035)
[WARNING][1] kube-controllers/runconfig.go 193: unable to list KubeControllersConfiguration(default) error=connection is unauthorized: kubecontrollersconfigurations.crd.projectcalico.org "default" is forbidden: User "system:serviceaccount:kube-system:calico-kube-controllers" cannot list resource "kubecontrollersconfigurations" in API group "crd.projectcalico.org" at the cluster scope
2025-03-15 05:39:48 -07:00
Fredrik Liv
04a8adb17a Upcloud: Added support for routers and gateways (#11386)
* Upcloud: Added support for routers and gateways

* Upcloud: Added ipsec properties for UpCloud gateway VPN

* Upcloud: Added support for deprecated network field for loadbalancers
2025-03-15 02:05:46 -07:00
Kubernetes Prow Robot
986f461ef1 Merge pull request #12026 from VannTen/ci/convert_vagrant_to_kubevirt
CI: use kubevirt for molecule provisioning instead of vagrant
2025-03-13 08:57:48 -07:00
Max Gautier
96aadc3614 CI: only define mode in tests if needed (#12030)
All tests have a mode actually, so changing the default is no big deal
(and it makes more sense to have `default` as the default).
2025-03-13 03:47:46 -07:00
Max Gautier
0efb415ec6 CI: molecule fixes
- use become: true when running bootstrap-os in molecule
- match host name in bastion-ssh-config to not confuse the provisioner
2025-03-13 10:14:49 +01:00
Max Gautier
71ae3c78e2 CI: ensure kubevirt VMs are available as part of provisioning
Allows using the provisioning playbook from multiple users without
duplicating the ssh check.
2025-03-13 10:14:49 +01:00
Max Gautier
315313dd10 CI: convert molecule jobs to parallel:matrix
With the new provisioning using kubevirt this should be faster.
2025-03-13 10:14:48 +01:00
Max Gautier
f70c33d71a CI: convert molecule provisioning to kubevirt
Includes some change of OS version when the vagrant boxes were old ones
(Debian 10)
2025-03-13 10:14:48 +01:00
Max Gautier
ac4c41e4e6 CI: use OS name in VMs
Allows an easier log reading on multi-OS test runs (such as molecule
tests)
2025-03-13 10:14:47 +01:00
Max Gautier
611f645907 CI: Generate ssh key pair on the fly
There is little reason to share an ssh key common to all CI jobs, so
generate one for each on the fly.

Also use plain-text cloud-init config instead of base64 for readability
2025-03-13 10:14:46 +01:00
Max Gautier
e4905f1d1d CI: convert molecule jobs to kubevirt 2025-03-13 10:14:46 +01:00
Max Gautier
acc843a5fa CI: convert containerd molecule to use kubevirt provisioning 2025-03-13 10:14:45 +01:00
Max Gautier
e62bbe0c76 CI: adapt packet-ci role to act as a molecule provisioner
To work with molecule, we need to use the name provided by molecule_yml
in inventory.

Inject the name in the VirtualMachineInstance (with a default to handle
non-molecule scenarios) and get it back as part of the inventory.

Account for no ansible groups
2025-03-13 10:14:45 +01:00
Max Gautier
d0f91adde4 CI: allow cluster layout to be fully defined by individual test 2025-03-13 10:14:44 +01:00
Max Gautier
a8d494fb95 CI/kubevirt: allow every var in kubevirt template to be overridden
The current templating of kubevirt VirtualMachine relies on global
ansible variables, except for the group the nodes are meant to be in.

In order to have more flexibility (in particular, mixed OS clusters for
instance), now expect an arbitrary dict to be passed to the template;
this allows embedding directly in the node definitions any variable used
by the template.
2025-03-13 10:14:44 +01:00
Max Gautier
119fa5b0c0 CI: Remove cruft
Remove unused variables, and intermediary ones.
2025-03-13 10:14:43 +01:00
peterw
c8d75effcb cilium 1.17: fix etcd trusted-ca-file config var (#11986) 2025-03-10 21:19:46 -07:00
fullstackjam
aa6aa1522b Fix syntax error in k8s-net-cilium.yml (#12023) 2025-03-10 20:25:46 -07:00
Max Gautier
e2d86c3413 Update patch versions: cri-o, cri-dockerd, gvisor, etcd, cilium-cli, cni (#12024) 2025-03-10 20:05:45 -07:00
Max Gautier
7de9350c07 Remove download_hash.sh (#12025)
The script is obsoleted by 5d7236ea5 (Merge pull request #11890 from
VannTen/download_graphql_checksums_2, 2025-03-09), since the format of
checksums is no longer compatible.
2025-03-10 03:25:46 -07:00
Kubernetes Prow Robot
5d7236ea5f Merge pull request #11890 from VannTen/download_graphql_checksums_2
Adapt checksums and versions to new hashes updater
2025-03-09 20:19:45 -07:00
ChengHao Yang
7c611890c3 Fix: CRI-O default capabilities follow with the upstream (#12018)
* Fix: CRI-O default capabilities follow with the upstream

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>

* Docs: CRI-O default capabilities follow with upstream

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>

---------

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-03-05 23:55:50 -08:00
Ekko
6d4714b66e Make main_access_ip cacheable in facts (#12020)
Signed-off-by: ekko <lihai.tu@daocloud.io>
2025-03-05 23:51:46 -08:00
Max Gautier
299178e587 Update README.md with v-less versions
Also small spacing fix.
2025-03-05 16:19:52 +01:00
Max Gautier
fec663a27a Assert that versions have been stripped of leading 'v'
Only check version that we previously prefixed with 'v'.
This commit should be reverted as soon as 2.28 is released.
2025-03-05 16:18:39 +01:00
Max Gautier
f9a263090a Propagate v-less version everywhere 2025-03-05 16:18:39 +01:00
Max Gautier
800c84dcc9 download: use full hash form (including hash alg)
Allow the use of different hashes, as supported by the get_url
Ansible module.
Change the variable name accordingly to 'checksum' since it's not
exclusively sha256 anymore.
2025-03-05 16:18:38 +01:00
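
As an illustration of the "full hash form" described in the commit above, here is a minimal Python sketch of checking data against a checksum written as `<algorithm>:<hexdigest>`. The function name `verify_checksum` and the sample payload are hypothetical; this is not the code used by Kubespray or by the `get_url` module, only a sketch of the idea under those assumptions.

```python
import hashlib


def verify_checksum(data: bytes, checksum: str) -> bool:
    """Check data against a checksum of the form '<algorithm>:<hexdigest>'.

    Hypothetical helper for illustration only.
    """
    algorithm, sep, expected = checksum.partition(":")
    if not sep:
        raise ValueError(f"missing algorithm prefix in {checksum!r}")
    # hashlib.new() accepts the algorithm name, so the same code path
    # handles sha256, sha1, sha512, ...
    digest = hashlib.new(algorithm, data).hexdigest()
    return digest == expected.lower()


payload = b"example artifact"
good = "sha256:" + hashlib.sha256(payload).hexdigest()
print(verify_checksum(payload, good))
```

Carrying the algorithm inside the value is what lets a single variable hold checksums of different kinds, which is presumably why the variable was renamed to 'checksum'.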
Max Gautier
f6ae46c9d8 download: remove 'v' version prefix from defaults
also remove the version key in all dict in downloads since it's not used
anywhere.
2025-03-05 16:18:37 +01:00
Max Gautier
8467724aab checksums: use full gvisor version
The versions are nearly all .0 because of the gvisor release scheme.
This means they need to be quoted in yaml to be considered strings.
Special casing by removing the .0 makes tooling more complicated, and it
does not gain us anything apart from a nicer looking file (I guess).

So just use the version of upstream gvisor and quote it.
2025-03-05 16:18:37 +01:00
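
A small Python sketch of why the quoting matters, assuming a YAML loader that resolves an unquoted `yyyymmdd.p` scalar to a float (YAML 1.1 loaders do this). The tag values and the helper `as_version_string` are hypothetical and only illustrate the failure mode; they are not taken from the Kubespray tooling.

```python
def as_version_string(raw: object) -> str:
    """Return raw if it is already a string, otherwise fail loudly.

    Hypothetical guard: a value that arrives as a number was not quoted
    in the YAML file and may already have lost information.
    """
    if not isinstance(raw, str):
        raise TypeError(
            f"version {raw!r} is a {type(raw).__name__}; quote it in YAML"
        )
    return raw


# Two distinct (made-up) tags become indistinguishable once parsed as floats.
collapsed = float("20250101.10") == float("20250101.1")
print(collapsed)
```

Keeping the upstream tag as a quoted string avoids this class of problem without any special casing in the tooling.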
Max Gautier
12270243f5 checksums: Use full hash form (including hash algo as prefix) 2025-03-05 16:18:33 +01:00
Max Gautier
977e41ac5e checksums: remove v from versions 2025-03-05 16:17:29 +01:00
Max Gautier
359467b525 checksums: remove checksums with value == 0
With the refactored script, this would try to download those releases
each time.
2025-03-05 16:16:07 +01:00
Mohamed Omar Zaian
8f4b7f9f5d [containerd] Make containerd v2.0.3, update hashes for other releases (#12011) 2025-03-04 11:59:44 -08:00
Mohamed Omar Zaian
95f059d2c1 [calico] Make calico v3.29.2, update hashes for other releases (#12012) 2025-03-04 11:55:44 -08:00
Jean-Vincent kassi
358bacf7ea Add crio_default_capabilities variables and documentation (#11989) 2025-03-04 04:09:42 -08:00
Mohamed Omar Zaian
0632f23a63 Make kubernetes 1.32.2 default (#12003) 2025-03-03 04:53:15 -08:00
Max Gautier
a665b43854 CI: Put pre-commit cache under CI_PROJECT_DIR (#11929)
* CI: Put pre-commit cache under CI_PROJECT_DIR

Apparently gitlab-runner can't cache stuff outside of the project
directory.

Put the cache under CI_PROJECT_DIR to make it work (which also means we
need to ignore it from ansible-lint).

Also update the pre-commit image while we're at it.

Link: https://gitlab.com/gitlab-org/gitlab/-/issues/14151

* update ansible-lint pre-commit
2025-03-03 01:30:58 -08:00
Mathieu Parent
7590d95976 Bump node-local-dns (k8s-dns-node-cache) image (#11981) 2025-03-02 19:34:56 -08:00
Chad Swenson
87f7363e46 kubeadm config validate flag for debugging and development (#12001)
This adds a new flag with default `kubeadm_config_validate_enabled: true` to use when debugging features and enhancements affected by the `kubeadm config validate` command.

This new flag should be set to `false` only for development and testing scenarios where validation is expected to fail (pre-release Kubernetes versions, etc).

While working with development and test versions of Kubernetes and Kubespray, I found this option very useful.
2025-03-02 18:50:57 -08:00
Mathieu Parent
1b2e66cd30 Update kube-vip to v0.8.9 (#11983) 2025-02-27 22:00:27 -08:00
ERIK
768fbeff0b update etcd snapshot count (#11997)
Signed-off-by: bo.jiang <bo.jiang@daocloud.io>
2025-02-27 01:30:32 -08:00
Antoine Legrand
7d21a54dc7 Remove amazon-linux2 from CI: issue with vm creation (#11995) 2025-02-26 05:42:30 -08:00
Loong Dai
e0a141ab12 README: add docker run (#11967)
* README: add docker run

Signed-off-by: Loong <long0dai@foxmail.com>

* Update README.md

Co-authored-by: Robert Neumann <r0b2g1t@users.noreply.github.com>

* Update README.md

Co-authored-by: Robert Neumann <r0b2g1t@users.noreply.github.com>

---------

Signed-off-by: Loong <long0dai@foxmail.com>
Co-authored-by: Robert Neumann <r0b2g1t@users.noreply.github.com>
2025-02-23 04:14:27 -08:00
A_tian
e332375293 Fix uncached_hosts syntax error (#11985)
* Fix uncached_hosts syntax error

Signed-off-by: panyintian.fu <1416297674@qq.com>

* Update roles/kubernetes/preinstall/tasks/0040-verify-settings.yml

Co-authored-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>

---------

Signed-off-by: panyintian.fu <1416297674@qq.com>
Co-authored-by: Antoine Legrand <2t.antoine@gmail.com>
Co-authored-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-02-21 12:38:27 -08:00
Ekko
a60ec1dbde Fix architecture identifier of kata containers download url (#11975)
Signed-off-by: ekko <lihai.tu@daocloud.io>
2025-02-19 00:04:26 -08:00
Ekko
31c470137f Fix containerd 2.x configuration (#11963)
Signed-off-by: ekko <lihai.tu@daocloud.io>
2025-02-17 21:42:11 -08:00
dependabot[bot]
011e839f52 build(deps): bump cryptography from 44.0.0 to 44.0.1 (#11971)
Bumps [cryptography](https://github.com/pyca/cryptography) from 44.0.0 to 44.0.1.
- [Changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/pyca/cryptography/compare/44.0.0...44.0.1)

---
updated-dependencies:
- dependency-name: cryptography
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-02-17 18:38:12 -08:00
Jan Breitkopf
d7962fb46e fix quotations in dhclient hooks (#11946) 2025-02-15 04:16:20 -08:00
Moritz Graf
dbb9900085 Update main.yml (#11943)
Adding single quotes around parameters in crictl.
2025-02-14 07:56:24 -08:00
Max Gautier
e24216bedc Automatically derive defaults versions from checksums (#11906)
* Automatically derive defaults versions from checksums

Currently, when updating checksums, we manually update the default
versions.
However, AFAICT, for all components where we have checksums, we're using
the newest version out of those checksums.

Codify this in the `_version` defaults variables definition to make the
process automatic and reduce manual steps (as well as  the diff size
during reviews).

We assume the versions are sorted, with newest first. This should be
guaranteed by the pre-commit hooks.

* Validate checksums are ordered by versions, newest first

* Generalize render-readme-versions hook for other static files

The pre-commit hook introduced in a142f40e2 (Update versions in README.md
with pre-commit, 2025-01-21) allows updating our README with new
versions.
It turns out other "static" files (i.e. files which don't interpret Ansible
variables) also use the default version (in that case, our Dockerfiles,
but there might be others).
The Dockerfile breaks if the variable it uses (`kube_version`) is a
Jinja template.

For helping with automatic version upgrade, generalize the hook to deal
with other static files, and make a template out of the Dockerfile.

* Dockerfile: template kube_version with pre-commit instead of runtime

* Validate all versions/checksums are strings in pre-commit

All the ansible/python tooling for versions expects version strings. YAML
unhelpfully considers some values to be numbers, so enforce this.

* Stringify checksums versions
2025-02-14 00:28:21 -08:00
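
The idea in the commit above (default version = newest entry of the checksums, with ordering validated) can be sketched in Python as follows. This is an assumption-laden illustration: the real change is implemented with Ansible variables and pre-commit hooks, the names `version_key`, `is_newest_first` and `default_version` are hypothetical, and versions are assumed to be plain dotted integers.

```python
def version_key(version: str) -> tuple[int, ...]:
    """Turn '1.32.2' into (1, 32, 2) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def is_newest_first(versions: list[str]) -> bool:
    """True when each version is strictly newer than the one after it."""
    keys = [version_key(v) for v in versions]
    return all(a > b for a, b in zip(keys, keys[1:]))


def default_version(checksums: dict[str, str]) -> str:
    """Pick the first key of the checksums mapping, after validating order."""
    versions = list(checksums)  # dicts preserve insertion order
    if not versions:
        raise ValueError("no checksums available")
    if not is_newest_first(versions):
        raise ValueError("checksums must be ordered newest first")
    return versions[0]


checksums = {
    "1.32.2": "sha256:aaa",  # placeholder values, not real hashes
    "1.32.1": "sha256:bbb",
    "1.31.6": "sha256:ccc",
}
print(default_version(checksums))
```

Validating the order separately is what makes "take the first entry" safe; note that a plain string sort would place "1.9.0" above "1.10.0", hence the numeric key.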
Boris
a51e7dd07d refact ip stack (#11953) 2025-02-11 03:37:58 -08:00
Bas
c557adf911 bugfix (#11959) 2025-02-08 03:01:55 -08:00
Ali Afsharzadeh
d10a2cd4c6 Update load balancers versions to Nginx 1.27, Haproxy 3.1 (#11928) 2025-02-07 19:11:57 -08:00
Raul Butuc
cfad1bd420 Fix incorrect syntax for secondary nodelocaldns manifest (#11952)
* Fix incorrect syntax

* Fix incorrect syntax
2025-02-07 07:09:55 -08:00
Kay Yan
08b77b5350 Fix CI by exclude the .ansible in .ansible-lint & remove ctr image pull workaround (#11948)
* exclude .ansible in ansible-lint

* remove ctr i pull workaround

Signed-off-by: Kay Yan <kay.yan@daocloud.io>

---------

Signed-off-by: Kay Yan <kay.yan@daocloud.io>
2025-02-06 03:53:56 -08:00
Mohamed Omar Zaian
fe0a1f4e42 Bump nerdctl to v2.0.3 (#11913) 2025-02-05 00:50:16 -08:00
Mohamed Omar Zaian
624937d137 [containerd] Support containerd v2.0.x (#11845) 2025-02-04 23:32:15 -08:00
Antoine Legrand
4373c1be1d Revert "Add support for ipv6 only cluster via "enable_ipv6only_stack_networks…" (#11941)
This reverts commit 76c0a3aa75.
2025-02-03 07:06:58 -08:00
Bas
59e1638ae1 Bugfix/11936 - backup: "{{ leave_etc_backup_files }}" (#11937)
* Adding the var: leave_etc_backup_files

* Fix for #11936 - backup: "{{ leave_etc_backup_files }}"
2025-01-30 06:19:23 -08:00
dependabot[bot]
6af849089e build(deps): bump the molecule group with 2 updates (#11933)
Bumps the molecule group with 2 updates: [molecule](https://github.com/ansible-community/molecule) and [molecule-plugins[vagrant]](https://github.com/ansible-community/molecule-plugins).


Updates `molecule` from 24.12.0 to 25.1.0
- [Release notes](https://github.com/ansible-community/molecule/releases)
- [Commits](https://github.com/ansible-community/molecule/compare/v24.12.0...v25.1.0)

Updates `molecule-plugins[vagrant]` from 23.6.0 to 23.7.0
- [Release notes](https://github.com/ansible-community/molecule-plugins/releases)
- [Commits](https://github.com/ansible-community/molecule-plugins/compare/v23.6.0...v23.7.0)

---
updated-dependencies:
- dependency-name: molecule
  dependency-type: direct:production
  update-type: version-update:semver-major
  dependency-group: molecule
- dependency-name: molecule-plugins[vagrant]
  dependency-type: direct:production
  update-type: version-update:semver-minor
  dependency-group: molecule
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-01-28 07:05:26 -08:00
Arthur Outhenin-Chalandre
46e1fbcdd9 dependabot: add group for molecule (#11927)
Signed-off-by: Arthur Outhenin-Chalandre <arthur@cri.epita.fr>
2025-01-28 00:59:23 -08:00
Max Gautier
1567e8ee6c Add timestamp to kaniko builds (#11923)
The build steps at the start of CI take about 2 minutes; now that we
have greatly reduced the overall duration, this is not an insignificant
impact.

Add timestamps to the build process to measure which steps of the
image build take the most time.
2025-01-27 06:17:23 -08:00
Boris
76c0a3aa75 Add support for ipv6 only cluster via "enable_ipv6only_stack_networks" (#11831) 2025-01-27 04:15:22 -08:00
Qasim Mehmood
e107022b4b Publish the ingress-nginx service address if manual address not defined and not using host network (#11879) 2025-01-24 00:47:21 -08:00
Anshuman Agarwala
ebcf9c3fff Updated sample in inventory (#11895)
* Updated sample in inventory

* Review changes
2025-01-23 21:39:21 -08:00
Max Gautier
d23c1464c9 Remove krew support (#11824)
* Remove krew installation support

Krew is fundamentally a tool to install kubectl plugins, which are eminently
client-side things.
It's also not difficult to install on a client machine.

* Remove krew cleanup
2025-01-23 20:45:21 -08:00
Kubernetes Prow Robot
cbd0b7bbc3 Merge pull request #11901 from VannTen/cleanup/verify_settings
Cleanup of preinstall assertions
2025-01-23 08:40:58 -08:00
Max Gautier
67a73764e4 Remove deprecation checks admission plugins list
This assertion has been present since 2022; users' inventories should be clean
of it by now.
2025-01-23 14:32:43 +01:00
Max Gautier
fba31beb07 Remove containerd_config assert
This assert has been present since 2021; we can assume users have now removed
it from their inventories.
2025-01-23 14:32:43 +01:00
Max Gautier
775361206c Drop compatibility for etcd_kubeadm_enabled
This has been deprecated for a long time, time to pull the plug.
We leave an assert for one release to have a straightforward failure if
some users were still using the variable.
2025-01-23 14:32:42 +01:00
Max Gautier
12a2c5eaa8 verify_settings: consolidate choices validation 2025-01-23 14:32:42 +01:00
Max Gautier
ed789c9b97 etcd_kubeadm simplify assert 2025-01-23 14:32:41 +01:00
Max Gautier
85d9e3e2ae Don't check address space when using 'none' network plugin
Since 'none' can be, for instance, a manual calico deployment, don't
check whether there is enough ip for pods on a node, because the plugin
can use another mechanism than the podCIDR to allocate IPs.
2025-01-23 14:32:40 +01:00
Max Gautier
98cdb5348c verify settings: fix etcd assertion when the etcd group is implicit
When the etcd group is not specified we assume it's kube_control_plane.
In that case, the number of etcd members still can't be even, so instead of only checking the
etcd group we need to default to kube_control_plane
2025-01-23 14:30:28 +01:00
Max Gautier
f53552e56b verify_settings: Consolidate assert loop in one task 2025-01-23 14:30:26 +01:00
Max Gautier
277ab7339a verify_settings: fix bad task name + remove redundant conditions 2025-01-23 14:29:48 +01:00
Max Gautier
191f71afea Drop explicit k8s_cluster group in CI inventory (#11858)
This removes compatibility with releases below 2.27.0, now that it has
been released and that we're testing upgrades against it.
2025-01-23 02:34:58 -08:00
Max Gautier
bfe858ba06 CI: cleanup dependencies, pre-commit autoupdate (#11904)
ansible-lint and yamllint are run as pre-commit hooks, which are
installed by pre-commit directly. So there is no need to put them in
tests/requirements.txt.

So remove them and make it leaner.
2025-01-23 01:56:59 -08:00
Max Gautier
f8c4d5a899 Fix: hide 'ansible managed' markers in README.md (#11919)
[//]: -> apparently does not work for hiding on Github markdown
2025-01-23 01:34:58 -08:00
c-romeo
9008c40d0e fix Calico typha deployment issue: #11916 (#11917) 2025-01-23 01:05:01 -08:00
Kubernetes Prow Robot
5a7e1be070 Merge pull request #11905 from VannTen/feat/readme_template_version
Update README.md versions automatically in pre-commit
2025-01-22 19:42:37 -08:00
Max Gautier
2a7b50a016 calico: don't set calico-node cpu limits by default (#11914)
Upstream calico isn't doing that, and:
- this can cause throttling
- the cpu needed by calico is very cluster / workload dependent
- missing cpu limits will not starve other pods (unlike missing memory
  requests), because the kernel scheduler will still give priority to
  other processes in pods not exceeding their requests
2025-01-22 19:24:36 -08:00
Max Gautier
d2e51e777c CI: cleanup vars identical to kubespray defaults (#11903) 2025-01-21 05:46:37 -08:00
Max Gautier
89476b48e5 CI: scope stdout debug callback to kubespray test runs
The debug callback apparently breaks using ansible-playbook in
pre-commit, so scope the variables to only where we're using it instead.
2025-01-21 14:07:32 +01:00
Max Gautier
3f01d4725d Apply new pre-commit version updater 2025-01-21 12:10:43 +01:00
Max Gautier
a142f40e25 Update versions in README.md with pre-commit
Currently, versions in README.md need to be manually updated, and we
check it's done with a bash script.

Add a small utility playbook to add versions in README.md from their
actual default values, automatically.
This is done in pre-commit, and replaces the scripted check; instead it
will autofix the README.md, and fail in CI if needed.

We move markdownlint after the local hooks to give it the opportunity
to catch a problem with the rendering.
2025-01-21 12:10:21 +01:00
Max Gautier
0e91000a04 CI: remove retry from jobs (#11899)
Since e8ee42280 (CI: remove deletion tasks of 'packet' VMs, 2024-09-13),
our tests appear to no longer be flaky.
The current retry slows down the testing feedback on pull requests.

Since it's not needed anymore, don't retry and fail fast.
2025-01-19 18:38:35 -08:00
Kubernetes Prow Robot
e73c2d081c Merge pull request #11898 from VannTen/cleanup/ci/run_without_sample
Run CI without the sample inventory
2025-01-17 08:00:36 -08:00
Max Gautier
5862bff044 ci: show pre-commit diff on failure
Sometimes the changes made by pre-commit are not obvious; this should
help.
2025-01-17 16:22:58 +01:00
Max Gautier
b548ccbe7f Adapt CI/vagrant to run without sample inventory 2025-01-17 16:22:57 +01:00
Kubernetes Prow Robot
a5142e7dfd Merge pull request #11891 from VannTen/download_graphql
Overhaul of the python hashes updater
2025-01-17 04:16:07 -08:00
ChengHao Yang
3930919283 Cleanup OWNERS files in each folders (#11892)
* Cleanup not in k-sigs members OWNERS

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>

* Cleanup inactive members on Kubespray

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>

---------

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-01-15 12:10:34 -08:00
Kay Yan
b104bb7a57 [kubernetes] Support Kubernetes v1.32.0 with RHEL8 (#11885)
* [kubernetes] Support Kubernetes v1.32.0

* add workaround for RHEL8

Signed-off-by: Kay Yan <kay.yan@daocloud.io>

---------

Signed-off-by: Kay Yan <kay.yan@daocloud.io>
Co-authored-by: Mohamed Zaian <mohamedzaian@gmail.com>
2025-01-15 08:54:35 -08:00
Max Gautier
bc36e9d440 hash-updater: apply formatter 2025-01-15 14:34:48 +01:00
Max Gautier
d8629b8e7e download: separate static metadata into its own file
By separating logic from data, we should make it easier to add new
components.
2025-01-15 14:32:49 +01:00
Bas
c84336b48c Contrib: upload2artifactory.py (#11886)
* Contrib: upload2artifactory.py

Signed-off-by: Bas Meijer <bas.meijer@enexis.nl>

* Pythonic

Signed-off-by: Bas Meijer <bas.meijer@enexis.nl>

* Suggested

Co-authored-by: Arthur Outhenin-Chalandre <arthur@cri.epita.fr>

* upload2artifactory.py documentation.

---------

Signed-off-by: Bas Meijer <bas.meijer@enexis.nl>
Co-authored-by: Arthur Outhenin-Chalandre <arthur@cri.epita.fr>
2025-01-15 05:18:33 -08:00
Christian Kröger
403a73ac11 [ingress-nginx] expose custom tcp and udp ports in ingress-nginx-controller (#11850) 2025-01-15 05:14:33 -08:00
Fredrik Liv
5ca23e3bfe Changed to use first_kube_control_plane to parse kubeadm_certificate_key (#11875)
Co-authored-by: nvalembois <nvalembois@live.com>
2025-01-14 08:34:34 -08:00
Max Gautier
4d3f06e69e download: cleanup graphQL query
- remove unused parts in the response
- clarify variables names
2025-01-14 17:04:29 +01:00
Max Gautier
d17bd286ea download: allow excluding some component
This is handy when some component release is buggy (missing file at the
download links) to not block everything else.

Move the filtering up the stack so we don't have to do it multiple
times.
2025-01-14 17:04:28 +01:00
Max Gautier
55cff4f3d3 download: get checksums file relative to git root
This means the update-hashes command can be run anywhere in Kubespray
repository without having to figure out the correct path.
2025-01-14 17:04:28 +01:00
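
One way to resolve a file relative to the repository root, as the commit above describes, is sketched below in Python. This is only an assumption about the approach: the actual script may ask git directly instead of walking the directory tree, and the helper names and the `checksums.yml` relative path are hypothetical placeholders.

```python
from pathlib import Path


def find_git_root(start: Path) -> Path:
    """Walk up from start until a directory containing '.git' is found."""
    current = start.resolve()
    for candidate in (current, *current.parents):
        if (candidate / ".git").exists():
            return candidate
    raise FileNotFoundError(f"no git repository found above {start}")


def checksums_path(start: Path, relative: str = "checksums.yml") -> Path:
    """Build the checksums file path from the git root.

    'relative' is a placeholder, not the real location in Kubespray.
    """
    return find_git_root(start) / relative
```

Calling `checksums_path(Path.cwd())` from any subdirectory of a checkout would then yield the same absolute path, which is the property the commit is after.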
Max Gautier
76e07daa12 download: put GraphQL query in package + read from importlib 2025-01-14 17:04:27 +01:00
Max Gautier
a551922c84 Adapt download.py to run as a package script 2025-01-14 17:04:27 +01:00
Max Gautier
ba3258d7f0 Move download_hash.py into a python package
Can operate on several branches without the need for backport
2025-01-14 17:04:26 +01:00
Max Gautier
9b56840d51 download: create pyproject.toml 2025-01-14 17:04:24 +01:00
Max Gautier
4351b47ebe download: convert to logging 2025-01-14 17:04:18 +01:00
Max Gautier
b08c5e8b14 download: Log Github rate-limit status 2025-01-14 17:02:29 +01:00
Kay Yan
3527cb1916 Update CI test from AlmaLinux8 to AlmaLinux9 (#11889)
Signed-off-by: Kay Yan <kay.yan@daocloud.io>
2025-01-14 02:50:32 -08:00
Max Gautier
81790cab91 download: remove unneeded imports 2025-01-14 10:41:42 +01:00
Max Gautier
9fbc566d98 download: Support adding new versions and update the doc 2025-01-14 10:41:41 +01:00
Max Gautier
ff768cc9fe download: support multiple hash algorithm 2025-01-14 10:41:41 +01:00
Max Gautier
ff3d9a0443 download: Support for gvisor (part 2)
Gvisor releases, besides only being tags, have some particularities:
- they are of the form yyyymmdd.p -> this gets interpreted as a yaml
  float, so we need to explicitly convert to string to make it work.
- there is no semver-like meaning attached to the version numbers, but the API
  (= OCI container runtime interface) is expected to be stable (see
  linked discussion)
- some older tags don't have hashes for some archs

Link: https://groups.google.com/g/gvisor-users/c/SxMeHt0Yb6Y/m/Xtv7seULCAAJ
2025-01-14 10:41:40 +01:00
Max Gautier
6608efb2c4 download: compute version from Github tags for gvisor
Gvisor is the only one of our deployed components which uses tags instead
of proper releases. So the tag scraping support will, for now, cater to
gvisor particularities, notably in the tag name format and the fact that
some older releases don't have the same URL scheme.
2025-01-14 10:41:39 +01:00
Max Gautier
479fda6355 download: support cri-dockerd, youki, kata, crun 2025-01-14 10:41:39 +01:00
Max Gautier
3a44411aa1 Support projects using alternate names for arch
(the url should use `alt_arch` instead of `arch` for those)
2025-01-14 10:41:38 +01:00
Max Gautier
9334bc1fee support components with no premade hashes 2025-01-14 10:41:38 +01:00
Max Gautier
c94daa4ff5 download: Update yaml data with new hashes 2025-01-14 10:41:37 +01:00
Max Gautier
5be8155394 remove old loops and generators 2025-01-14 10:41:36 +01:00
Max Gautier
08913c4aa0 Don't use 'checksum' in the components names 2025-01-14 10:41:36 +01:00
Max Gautier
38dd224ffe Extract get_hash into its own function
Also, always raise even for 404 not found (should not happen now that
we'll use GraphQL to find the exact set of versions)
2025-01-14 10:41:36 +01:00
Max Gautier
24c59cee59 download_hash: adapt download urls to v-less versions 2025-01-14 10:41:35 +01:00
Max Gautier
2be54b2bd7 Filter new versions for new ones and same minor releases
We're only interested in new patch releases for auto-update.
2025-01-14 10:41:35 +01:00
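One way to read "new patch releases only" is: keep an upstream version only if its major.minor is already tracked and its patch number is higher than the highest one recorded for that minor. The sketch below implements that reading with plain tuples; `parse` and `new_patch_releases` are hypothetical names, and the real script may well use a proper version library instead.

```python
def parse(version: str) -> tuple[int, int, int]:
    """Split a v-less 'major.minor.patch' string into integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def new_patch_releases(known: set[str], upstream: set[str]) -> list[str]:
    """Return upstream versions that are newer patches of already-tracked minors."""
    latest_patch: dict[tuple[int, int], int] = {}
    for version in known:
        major, minor, patch = parse(version)
        key = (major, minor)
        latest_patch[key] = max(latest_patch.get(key, -1), patch)

    candidates = []
    for version in upstream - known:
        major, minor, patch = parse(version)
        # Skip minors we do not track, and patches older than what we have.
        if (major, minor) in latest_patch and patch > latest_patch[(major, minor)]:
            candidates.append(version)
    return sorted(candidates, key=parse)


known = {"1.30.1", "1.31.0"}
upstream = {"1.29.9", "1.30.1", "1.30.2", "1.31.1", "1.32.0"}
print(new_patch_releases(known, upstream))
```

New minors such as `1.32.0` are deliberately left out, which matches the stated goal of automating patch updates only.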
Max Gautier
ae68766015 Filter by github results InvalidVersion
Containerd uses the same repository for releases of its gRPC API (which
we are not interested in).
Conveniently, those releases have tags which are not valid version
numbers (being prefixed with 'api/').

This could also be potentially useful for similar cases.
The risk of missing releases because of this is low, since it would
require that a project issues a new release with an invalid format, then
switches back to the previous format (or that we miss the fact it's not
updating for a long period of time).
2025-01-14 10:41:34 +01:00
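The filtering idea can be sketched as "try to parse every tag, drop the ones that fail". The commit title mentions `InvalidVersion`, presumably the exception of that name from the `packaging` library; to stay self-contained, the sketch below defines a local stand-in with a simple regex, so both `InvalidVersion` and `parse_tag` here are assumptions, and the tag names are illustrative.

```python
import re

# Accepts tags such as "v1.7.20" or "1.7.20"; anything else is rejected.
_VERSION = re.compile(r"v?\d+\.\d+\.\d+")


class InvalidVersion(ValueError):
    """Local stand-in for a version parser's 'not a version' error."""


def parse_tag(tag: str) -> str:
    """Return the v-less version for a tag, or raise InvalidVersion."""
    if not _VERSION.fullmatch(tag):
        raise InvalidVersion(tag)
    return tag.removeprefix("v")


def valid_versions(tags: list[str]) -> list[str]:
    """Keep only tags that parse as versions, preserving order."""
    versions = []
    for tag in tags:
        try:
            versions.append(parse_tag(tag))
        except InvalidVersion:
            continue  # e.g. tags prefixed with 'api/'
    return versions


print(valid_versions(["v1.7.20", "api/v1.8.0", "v2.0.0"]))
```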
Max Gautier
9f58ba60f3 download: compute new versions from Github API
We obtain the set of versions from Github, then for each component we do
a set comparison to determine which versions we don't have.
2025-01-14 10:41:34 +01:00
Max Gautier
a6219c84c9 Put graphql query in its own file 2025-01-14 10:41:33 +01:00
Max Gautier
7941be127d downloads: add graphql node ids
The Github graphQL API needs IDs for querying a variable array of
repositories.

Use a dict for components instead of an array of URLs and record the
corresponding node ID for each component (there are duplicates because
some binaries are provided by the same project/repository).
2025-01-14 10:41:33 +01:00
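A short sketch of why a component-to-node-ID dict is convenient: the set of unique IDs is what gets queried, and the results can then be fanned back out to every component that shares a repository. The IDs below are placeholders rather than real GitHub node IDs, and both helper functions are hypothetical.

```python
from collections import defaultdict

# Hypothetical mapping; kubelet and kubectl come from the same repository,
# so they share one (placeholder) node ID.
COMPONENTS = {
    "kubelet": "NODE_ID_KUBERNETES",
    "kubectl": "NODE_ID_KUBERNETES",
    "crictl": "NODE_ID_CRI_TOOLS",
}


def ids_to_query(components: dict[str, str]) -> list[str]:
    """Each repository only needs to be queried once."""
    return sorted(set(components.values()))


def components_by_id(components: dict[str, str]) -> dict[str, list[str]]:
    """Map each node ID back to every component released from it."""
    grouped: dict[str, list[str]] = defaultdict(list)
    for name, node_id in components.items():
        grouped[node_id].append(name)
    return {node_id: sorted(names) for node_id, names in grouped.items()}


print(ids_to_query(COMPONENTS))
print(components_by_id(COMPONENTS))
```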
Max Gautier
c938dfa634 scripts: get_nodes_ids.sh
Add the script used to obtain graphql node IDs from Github so it's
easier to add a new component.
2025-01-14 10:41:31 +01:00
ChengHao Yang
5a353cb04f Add manual option to the external_cloud_provider variable (#11883)
* Add `manual` option in the `external_cloud_provider` value

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>

* Update external cloud provider description in roles & sample inventory

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>

---------

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-01-13 00:12:34 -08:00
kyrie
1f186ed451 add containerd registry mirror certificate configuration (#11857)
Signed-off-by: KubeKyrie <shaolong.qin@daocloud.io>
2025-01-09 01:48:31 -08:00
Chad Swenson
8443f370d4 Structured AuthorizationConfiguration (#11852)
Adds the ability to configure the Kubernetes API server with a structured authorization configuration file.

Structured AuthorizationConfiguration is a new feature in Kubernetes v1.29+ (GA in v1.32) that configures the API server's authorization modes with a structured configuration file.
AuthorizationConfiguration files offer features not available with the `--authorization-mode` flag, although Kubespray supports both methods and authorization-mode remains the default for now.

Note: Because the `--authorization-config` and `--authorization-mode` flags are mutually exclusive, the `authorization_modes` ansible variable is ignored when `kube_apiserver_use_authorization_config_file` is set to true. The two features cannot be used at the same time.

Docs: https://kubernetes.io/docs/reference/access-authn-authz/authorization/#configuring-the-api-server-using-an-authorization-config-file
Blog + Examples: https://kubernetes.io/blog/2024/04/26/multi-webhook-and-modular-authorization-made-much-easier/
KEP: https://github.com/kubernetes/enhancements/tree/master/keps/sig-auth/3221-structured-authorization-configuration

I tested this all the way back to k8s v1.29 when AuthorizationConfiguration was first introduced as an alpha feature, although v1.29 required some additional workarounds with `kubeadm_patches`, which I included in example comments.

I also included some example comments with CEL expressions that allowed me to configure webhook authorizers without hitting kubeadm 1.29+ issues that block cluster creation and upgrades such as this one: https://github.com/kubernetes/cloud-provider-openstack/issues/2575.
My workaround configures the webhook to ignore requests from kubeadm and system components, which prevents fatal errors from webhooks that are not available yet, and should be authorized by Node or RBAC anyway.
2025-01-07 09:14:28 +01:00
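The mutual exclusivity noted in this commit message can be pictured as a single choice between two API server flags. The sketch below is illustrative only: the function name and the default file path are hypothetical, and it does not reflect how the Kubespray templates are actually written.

```python
def authorization_args(
    use_authorization_config_file: bool,
    authorization_modes: list[str],
    # Hypothetical path, chosen only for the example.
    config_path: str = "/etc/kubernetes/apiserver-authorization-config.yaml",
) -> list[str]:
    """Return exactly one of the two mutually exclusive flags."""
    if use_authorization_config_file:
        # authorization_modes is ignored in this branch, as the commit notes.
        return [f"--authorization-config={config_path}"]
    return ["--authorization-mode=" + ",".join(authorization_modes)]


print(authorization_args(False, ["Node", "RBAC"]))
print(authorization_args(True, ["Node", "RBAC"]))
```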
ChengHao Yang
1801debaea Add Flatcar 4081.2.1 image to test-infra (#11849)
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-01-07 08:38:28 +01:00
Kay Yan
369be00960 increase the memory requirement to 2GB (#11864)
Signed-off-by: Kay Yan <kay.yan@daocloud.io>
2025-01-07 08:00:28 +01:00
Kay Yan
ae1805587b cleanup for 2.27.0 (#11854)
Signed-off-by: Kay Yan <kay.yan@daocloud.io>
2025-01-07 05:06:29 +01:00
Noam
55d1e4a4b5 enable bash completion tasks for Suse OS family (#11860)
* remove check for os family on bash completion tasks

* add Suse
2025-01-06 15:36:16 +01:00
Max Gautier
ac9b76eb2e Ignore Mem preflight errors on ubuntu upgrade testcase (#11859) 2025-01-06 11:52:16 +01:00
653 changed files with 9318 additions and 45255 deletions


@@ -12,10 +12,12 @@ skip_list:
# (Disabled in June 2021)
- 'role-name'
# [var-naming] "defaults/main.yml" File defines variable 'apiVersion' that violates variable naming standards
# [var-naming]
# In Kubespray we use variables that use camelCase to match their k8s counterparts
# (Disabled in June 2021)
- 'var-naming'
- 'var-naming[pattern]'
# Variables names from within roles in kubespray don't need role name as a prefix
- 'var-naming[no-role-prefix]'
# [fqcn-builtins]
# Roles in kubespray don't need fully qualified collection names
@@ -37,5 +39,9 @@ exclude_paths:
- tests/files/custom_cni/cilium.yaml
- venv
- .github
- .ansible
- .cache
- .gitlab-ci.yml
- .gitlab-ci
mock_modules:
- gluster.gluster.gluster_volume


@@ -36,11 +36,35 @@ body:
attributes:
value: '### Environment'
- type: textarea
- type: dropdown
id: os
attributes:
label: OS
placeholder: 'printf "$(uname -srm)\n$(cat /etc/os-release)\n"'
options:
- 'RHEL 9'
- 'RHEL 8'
- 'Fedora 40'
- 'Ubuntu 24'
- 'Ubuntu 22'
- 'Ubuntu 20'
- 'Debian 12'
- 'Debian 11'
- 'Flatcar Container Linux'
- 'openSUSE Leap'
- 'openSUSE Tumbleweed'
- 'Oracle Linux 9'
- 'Oracle Linux 8'
- 'AlmaLinux 9'
- 'AlmaLinux 8'
- 'Rocky Linux 9'
- 'Rocky Linux 8'
- 'Amazon Linux 2'
- 'Kylin Linux Advanced Server V10'
- 'UOS Linux 20'
- 'openEuler 24'
- 'openEuler 22'
- 'openEuler 20'
- 'Other|Unsupported'
validations:
required: true
@@ -84,7 +108,6 @@ body:
- meta
- multus
- ovn4nfv
- weave
validations:
required: true


@@ -7,3 +7,15 @@ updates:
labels:
- dependencies
- release-note-none
groups:
molecule:
patterns:
- molecule
- molecule-plugins*
- package-ecosystem: "github-actions"
directory: "/"
labels:
- release-note-none
- ci-short
schedule:
interval: "weekly"

32
.github/workflows/auto-label-os.yml vendored Normal file

@@ -0,0 +1,32 @@
name: Issue labeler
on:
  issues:
    types: [opened]
permissions:
  contents: read
jobs:
  label-component:
    runs-on: ubuntu-latest
    permissions:
      issues: write
    steps:
      - uses: actions/checkout@1af3b93b6815bc44a9784bd300feb67ff0d1eeb3
      - name: Parse issue form
        uses: stefanbuck/github-issue-parser@2ea9b35a8c584529ed00891a8f7e41dc46d0441e
        id: issue-parser
        with:
          template-path: .github/ISSUE_TEMPLATE/bug-report.yaml
      - name: Set labels based on OS field
        uses: redhat-plumbers-in-action/advanced-issue-labeler@b80ae64e3e156e9c111b075bfa04b295d54e8e2e
        with:
          issue-form: ${{ steps.issue-parser.outputs.jsonString }}
          section: os
          block-list: |
            None
            Other
          token: ${{ secrets.GITHUB_TOKEN }}


@@ -0,0 +1,55 @@
name: Upgrade Kubespray components with new patches versions - all branches
on:
  schedule:
    - cron: '22 2 * * *' # every day, 02:22 UTC
  workflow_dispatch:
permissions: {}
jobs:
  get-releases-branches:
    if: github.repository == 'kubernetes-sigs/kubespray'
    runs-on: ubuntu-latest
    outputs:
      branches: ${{ steps.get-branches.outputs.data }}
    steps:
      - uses: octokit/graphql-action@abaeca7ba4f0325d63b8de7ef943c2418d161b93
        id: get-branches
        with:
          query: |
            query get_release_branches($owner:String!, $name:String!) {
              repository(owner:$owner, name:$name) {
                refs(refPrefix: "refs/heads/",
                     first: 1, # TODO increment once we have release branch with the new checksums format
                     query: "release-",
                     orderBy: {
                       field: ALPHABETICAL,
                       direction: DESC
                     }) {
                  nodes {
                    name
                  }
                }
              }
            }
          variables: |
            owner: ${{ github.repository_owner }}
            name: ${{ github.event.repository.name }}
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
  update-versions:
    needs: get-releases-branches
    strategy:
      fail-fast: false
      matrix:
        branch:
          - name: ${{ github.event.repository.default_branch }}
          - ${{ fromJSON(needs.get-releases-branches.outputs.branches).repository.refs.nodes }}
    uses: ./.github/workflows/upgrade-patch-versions.yml
    permissions:
      contents: write
      pull-requests: write
    name: Update patch updates on ${{ matrix.branch.name }}
    with:
      branch: ${{ matrix.branch.name }}


@@ -0,0 +1,44 @@
on:
  workflow_call:
    inputs:
      branch:
        description: Which branch to update with new patch versions
        default: master
        required: true
        type: string
jobs:
  update-patch-versions:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@1af3b93b6815bc44a9784bd300feb67ff0d1eeb3
        with:
          ref: ${{ inputs.branch }}
      - uses: actions/setup-python@v6
        with:
          python-version: '3.13'
          cache: 'pip'
      - run: pip install scripts/component_hash_update pre-commit
      - run: update-hashes
        env:
          API_KEY: ${{ secrets.GITHUB_TOKEN }}
      - uses: actions/cache@v4
        with:
          key: pre-commit-hook-propagate
          path: |
            ~/.cache/pre-commit
      - run: pre-commit run --all-files propagate-ansible-variables
        continue-on-error: true
      - uses: peter-evans/create-pull-request@84ae59a2cdc2258d6fa0732dd66352dddae2a412
        with:
          commit-message: Patch versions updates
          title: Patch versions updates - ${{ inputs.branch }}
          labels: bot
          branch: component_hash_update/${{ inputs.branch }}
          sign-commits: true
          body: |
            /kind feature
            ```release-note
            NONE
            ```


@@ -1,24 +1,30 @@
---
stages:
- build
- test
- deploy-part1
- deploy-extended
- build # build docker image used in most other jobs
- test # unit tests
- deploy-part1 # kubespray runs - common setup
- deploy-extended # kubespray runs - rarer or costlier (to test) setups
variables:
KUBESPRAY_VERSION: v2.26.0
FAILFASTCI_NAMESPACE: 'kargo-ci'
GITLAB_REPOSITORY: 'kargo-ci/kubernetes-sigs-kubespray'
GIT_CONFIG_COUNT: 2
GIT_CONFIG_KEY_0: user.email
GIT_CONFIG_VALUE_0: "ci@kubespray.io"
GIT_CONFIG_KEY_1: user.name
GIT_CONFIG_VALUE_1: "Kubespray CI"
ANSIBLE_FORCE_COLOR: "true"
ANSIBLE_STDOUT_CALLBACK: "debug"
MAGIC: "ci check this"
GS_ACCESS_KEY_ID: $GS_KEY
GS_SECRET_ACCESS_KEY: $GS_SECRET
CONTAINER_ENGINE: docker
SSH_USER: root
GCE_PREEMPTIBLE: "false"
ANSIBLE_KEEP_REMOTE_FILES: "1"
ANSIBLE_CONFIG: ./tests/ansible.cfg
ANSIBLE_REMOTE_USER: kubespray
ANSIBLE_PRIVATE_KEY_FILE: /tmp/id_rsa
ANSIBLE_INVENTORY: /tmp/inventory
ANSIBLE_STDOUT_CALLBACK: "default"
RESET_CHECK: "false"
REMOVE_NODE_CHECK: "false"
UPGRADE_TEST: "false"
@@ -26,12 +32,12 @@ variables:
ANSIBLE_VERBOSITY: 2
RECOVER_CONTROL_PLANE_TEST: "false"
RECOVER_CONTROL_PLANE_TEST_GROUPS: "etcd[2:]:kube_control_plane[1:]"
TERRAFORM_VERSION: 1.3.7
OPENTOFU_VERSION: v1.9.1
PIPELINE_IMAGE: "$CI_REGISTRY_IMAGE/pipeline:${CI_PIPELINE_ID}-${CI_COMMIT_SHORT_SHA}"
before_script:
- ./tests/scripts/rebase.sh
- mkdir -p /.ssh
- mkdir -p cluster-dump $ANSIBLE_INVENTORY
.job: &job
tags:
@@ -48,55 +54,13 @@ before_script:
extends: .job
needs:
- pipeline-image
- ci-not-authorized
- check-galaxy-version # lint
- pre-commit # lint
- vagrant-validate # lint
.testcases: &testcases
extends: .job-moderated
retry: 1
interruptible: true
before_script:
- update-alternatives --install /usr/bin/python python /usr/bin/python3 1
- ./tests/scripts/rebase.sh
- ./tests/scripts/testcases_prepare.sh
script:
- ./tests/scripts/testcases_run.sh
after_script:
- ./tests/scripts/testcases_cleanup.sh
# For failfast, at least 1 job must be defined in .gitlab-ci.yml
# Premoderated with manual actions
ci-not-authorized:
stage: build
before_script: []
after_script: []
rules:
# LGTM or ok-to-test labels
- if: $PR_LABELS =~ /.*,(lgtm|approved|ok-to-test).*|^(lgtm|approved|ok-to-test).*/i
variables:
CI_OK_TO_TEST: '0'
when: always
- if: $CI_PIPELINE_SOURCE == "schedule" || $CI_PIPELINE_SOURCE == "trigger"
variables:
CI_OK_TO_TEST: '0'
- if: $CI_COMMIT_BRANCH == "master"
variables:
CI_OK_TO_TEST: '0'
- when: always
variables:
CI_OK_TO_TEST: '1'
script:
- exit $CI_OK_TO_TEST
tags:
- ffci
needs: []
include:
- .gitlab-ci/build.yml
- .gitlab-ci/lint.yml
- .gitlab-ci/terraform.yml
- .gitlab-ci/packet.yml
- .gitlab-ci/kubevirt.yml
- .gitlab-ci/vagrant.yml
- .gitlab-ci/molecule.yml


@@ -1,5 +1,5 @@
---
.build-container:
pipeline-image:
cache:
key: $CI_COMMIT_REF_SLUG
paths:
@@ -7,26 +7,24 @@
tags:
- ffci
stage: build
image:
name: gcr.io/kaniko-project/executor:debug
entrypoint: ['']
image: moby/buildkit:rootless
variables:
TAG: $CI_COMMIT_SHORT_SHA
PROJECT_DIR: $CI_PROJECT_DIR
DOCKERFILE: Dockerfile
GODEBUG: "http2client=0"
BUILDKITD_FLAGS: --oci-worker-no-process-sandbox
CACHE_IMAGE: $CI_REGISTRY_IMAGE/pipeline:cache
# TODO: remove the override
# currently rebase.sh depends on bash (not available in the kaniko image)
# once we have a simpler rebase (which should be easy if the target branch ref is available as variable
# we'll be able to rebase here as well hopefully
before_script:
- echo "{\"auths\":{\"$CI_REGISTRY\":{\"auth\":\"$(echo -n ${CI_REGISTRY_USER}:${CI_REGISTRY_PASSWORD} | base64)\"}}}" > /kaniko/.docker/config.json
- mkdir -p ~/.docker
- echo "{\"auths\":{\"$CI_REGISTRY\":{\"auth\":\"$(echo -n ${CI_REGISTRY_USER}:${CI_REGISTRY_PASSWORD} | base64)\"}}}" > ~/.docker/config.json
script:
- /kaniko/executor --cache=true
--cache-dir=image-cache
--context $PROJECT_DIR
--dockerfile $PROJECT_DIR/$DOCKERFILE
--label 'git-branch'=$CI_COMMIT_REF_SLUG
--label 'git-tag=$CI_COMMIT_TAG'
--destination $PIPELINE_IMAGE
pipeline-image:
extends: .build-container
variables:
DOCKERFILE: pipeline.Dockerfile
- |
buildctl-daemonless.sh build \
--frontend dockerfile.v0 \
--local context=$CI_PROJECT_DIR \
--local dockerfile=$CI_PROJECT_DIR \
--opt filename=pipeline.Dockerfile \
--export-cache type=registry,ref=$CACHE_IMAGE \
--import-cache type=registry,ref=$CACHE_IMAGE \
--output type=image,name=$PIPELINE_IMAGE,push=true
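The `before_script` in this job writes a registry credentials file by hand with `echo` and `base64`. As a rough illustration of the structure it produces, the same JSON can be built in Python; the function name and the sample values are hypothetical, and in CI the real values come from `CI_REGISTRY`, `CI_REGISTRY_USER` and `CI_REGISTRY_PASSWORD`.

```python
import base64
import json


def registry_auth_config(registry: str, user: str, password: str) -> str:
    """Build the contents of a Docker-style config.json for one registry."""
    # The "auth" field is base64("user:password"), as in the shell one-liner.
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return json.dumps({"auths": {registry: {"auth": token}}})


# Sample values for illustration only.
config = registry_auth_config("registry.example.com", "ci-user", "s3cret")
print(config)
```

Building the document with a JSON library sidesteps the quoting problems that the escaped `echo` string is prone to when a password contains quotes or backslashes.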

153
.gitlab-ci/kubevirt.yml Normal file

@@ -0,0 +1,153 @@
---
.kubevirt:
extends: .job-moderated
interruptible: true
script:
- ansible-playbook tests/cloud_playbooks/create-kubevirt.yml
-e @"tests/files/${TESTCASE}.yml"
- ./tests/scripts/testcases_run.sh
variables:
ANSIBLE_TIMEOUT: "120"
tags:
- ffci
needs:
- pipeline-image
# TODO: generate testcases matrixes from the files in tests/files/
# this is needed to avoid the need for PR rebasing when a job was added or removed in the target branch
# (currently, a removed job in the target branch breaks the tests, because the
# pipeline definition is parsed by gitlab before the rebase.sh script)
# CI template for PRs
pr:
stage: deploy-part1
rules:
- if: $PR_LABELS =~ /.*ci-short.*/
when: manual
allow_failure: true
- if: $CI_COMMIT_BRANCH =~ /^pr-.*$/
when: on_success
- if: $CI_PIPELINE_SOURCE == "schedule" && $CI_PIPELINE_SCHEDULE_DESCRIPTION == "daily-ci"
when: on_success
- when: manual
allow_failure: true
extends: .kubevirt
parallel:
matrix:
- TESTCASE:
- almalinux9-crio
- almalinux9-kube-ovn
- debian11-calico-collection
- debian11-macvlan
- debian12-cilium
- debian13-cilium
- fedora39-kube-router
- openeuler24-calico
- rockylinux9-cilium
- ubuntu22-calico-all-in-one
- ubuntu22-calico-all-in-one-upgrade
- ubuntu24-calico-etcd-datastore
- ubuntu24-calico-all-in-one-hardening
- ubuntu24-cilium-sep
- ubuntu24-flannel-collection
- ubuntu24-kube-router-sep
- ubuntu24-kube-router-svc-proxy
- ubuntu24-ha-separate-etcd
- flatcar4081-calico
- fedora40-flannel-crio-collection-scale
# The ubuntu24-calico-all-in-one jobs are meant as early stages to prevent running the full CI if something is horribly broken
ubuntu24-calico-all-in-one:
stage: deploy-part1
extends: .kubevirt
variables:
TESTCASE: ubuntu24-calico-all-in-one
rules:
- if: $CI_COMMIT_BRANCH =~ /^pr-.*$/
when: on_success
- if: $CI_PIPELINE_SOURCE == "schedule" && $CI_PIPELINE_SCHEDULE_DESCRIPTION == "daily-ci"
when: on_success
- when: manual
allow_failure: true
pr_full:
extends: .kubevirt
stage: deploy-extended
rules:
- if: $PR_LABELS =~ /.*ci-full.*/
when: on_success
- if: $CI_PIPELINE_SOURCE == "schedule" && $CI_PIPELINE_SCHEDULE_DESCRIPTION == "daily-ci"
when: on_success
# Else run as manual
- when: manual
allow_failure: true
parallel:
matrix:
- TESTCASE:
- almalinux9-calico-ha-ebpf
- almalinux9-calico-nodelocaldns-secondary
- debian11-custom-cni
- debian11-kubelet-csr-approver
- debian12-custom-cni-helm
- fedora39-calico-swap-selinux
- fedora39-crio
- ubuntu24-calico-ha-wireguard
- ubuntu24-flannel-ha
- ubuntu24-flannel-ha-once
# Need an update of the container image to use schema v2
# update: quay.io/kubespray/vm-amazon-linux-2:latest
manual:
extends: pr_full
parallel:
matrix:
- TESTCASE:
- amazon-linux-2-all-in-one
rules:
- when: manual
allow_failure: true
pr_extended:
extends: .kubevirt
stage: deploy-extended
rules:
- if: $PR_LABELS =~ /.*(ci-extended|ci-full).*/
when: on_success
- if: $CI_PIPELINE_SOURCE == "schedule" && $CI_PIPELINE_SCHEDULE_DESCRIPTION == "daily-ci"
when: on_success
- when: manual
allow_failure: true
parallel:
matrix:
- TESTCASE:
- almalinux9-calico
- almalinux9-calico-remove-node
- almalinux9-docker
- debian11-docker
- debian12-calico
- debian12-docker
- debian13-calico
- rockylinux9-calico
- ubuntu22-all-in-one-docker
- ubuntu24-all-in-one-docker
- ubuntu24-calico-all-in-one
- ubuntu24-calico-etcd-kubeadm
- ubuntu24-flannel
# TODO: migrate to pr-full, fix the broken ones
periodic:
allow_failure: true
extends: .kubevirt
rules:
- if: $CI_PIPELINE_SOURCE == "schedule" && $CI_PIPELINE_SCHEDULE_DESCRIPTION == "daily-ci"
when: on_success
parallel:
matrix:
- TESTCASE:
- debian11-calico-upgrade
- debian11-calico-upgrade-once
- debian12-cilium-svc-proxy
- fedora39-calico-selinux
- fedora40-docker-calico
- ubuntu24-calico-etcd-kubeadm-upgrade-ha
- ubuntu24-calico-ha-recover
- ubuntu24-calico-ha-recover-noquorum
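GitLab evaluates a job's `rules` in order and applies the first one that matches. The sketch below mirrors the four rules of the `pr` job in this file as a plain Python function, to make the precedence easier to follow; the function and argument names are hypothetical, and Python's `re` stands in for GitLab's own regex engine, which behaves the same for these simple patterns.

```python
import re


def pr_job_when(pr_labels: str, branch: str, source: str,
                schedule_description: str = "") -> str:
    """Return the 'when' value the first matching rule would select."""
    # Rule 1: a ci-short label turns the job into a manual one.
    if re.search(r".*ci-short.*", pr_labels):
        return "manual"
    # Rule 2: pipelines on pr-* branches run the job automatically.
    if re.search(r"^pr-.*$", branch):
        return "on_success"
    # Rule 3: the scheduled daily-ci pipeline also runs it.
    if source == "schedule" and schedule_description == "daily-ci":
        return "on_success"
    # Rule 4: fallback, the job stays available as a manual action.
    return "manual"


print(pr_job_when("ci-short,lgtm", "pr-123", "push"))
print(pr_job_when("lgtm", "pr-123", "push"))
print(pr_job_when("", "master", "schedule", "daily-ci"))
```

Because the `ci-short` rule comes first, a labelled PR gets a manual job even though its branch also matches the `pr-*` rule.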


@@ -3,15 +3,17 @@ pre-commit:
stage: test
tags:
- ffci
image: 'ghcr.io/pre-commit-ci/runner-image@sha256:aaf2c7b38b22286f2d381c11673bec571c28f61dd086d11b43a1c9444a813cef'
image: 'ghcr.io/pre-commit-ci/runner-image@sha256:fe01a6ec51b298412990b88627c3973b1146c7304f930f469bafa29ba60bcde9'
variables:
PRE_COMMIT_HOME: /pre-commit-cache
PRE_COMMIT_HOME: ${CI_PROJECT_DIR}/.cache/pre-commit
ANSIBLE_STDOUT_CALLBACK: default
script:
- pre-commit run --all-files
- pre-commit run --all-files --show-diff-on-failure
cache:
key: pre-commit-all
key: pre-commit-2
paths:
- /pre-commit-cache
- ${PRE_COMMIT_HOME}
when: 'always'
needs: []
vagrant-validate:
@@ -22,14 +24,3 @@ vagrant-validate:
VAGRANT_VERSION: 2.3.7
script:
- ./tests/scripts/vagrant-validate.sh
except: ['triggers', 'master']
# TODO: convert to pre-commit hook
check-galaxy-version:
needs: []
stage: test
tags: [ffci]
image: python:3
script:
- tests/scripts/check_galaxy_version.sh


@@ -1,103 +1,55 @@
---
.molecule:
tags: [ffci-vm-med]
only: [/^pr-.*$/]
except: ['triggers']
image: quay.io/kubespray/vm-kubespray-ci:v13
services: []
tags: [ffci]
rules: # run on ci-short as well
- if: $CI_COMMIT_BRANCH =~ /^pr-.*$/
when: on_success
- if: $CI_PIPELINE_SOURCE == "schedule" && $CI_PIPELINE_SCHEDULE_DESCRIPTION == "daily-ci"
when: on_success
- when: manual
allow_failure: true
stage: deploy-part1
needs: []
# - ci-not-authorized
variables:
VAGRANT_DEFAULT_PROVIDER: "libvirt"
VAGRANT_HOME: "$CI_PROJECT_DIR/.vagrant.d"
PIP_CACHE_DIR: "$CI_PROJECT_DIR/.cache/pip"
before_script:
- mkdir -p $VAGRANT_HOME
- groups
- python3 -m venv citest
- source citest/bin/activate
- vagrant plugin expunge --reinstall --force --no-tty
- vagrant plugin install vagrant-libvirt
- pip install --no-compile --no-cache-dir pip -U
- pip install --no-compile --no-cache-dir -r $CI_PROJECT_DIR/requirements.txt
- pip install --no-compile --no-cache-dir -r $CI_PROJECT_DIR/tests/requirements.txt
- ./tests/scripts/rebase.sh
- ./tests/scripts/vagrant_clean.sh
image: $PIPELINE_IMAGE
needs:
- pipeline-image
script:
- ./tests/scripts/molecule_run.sh
after_script:
- ./tests/scripts/molecule_logs.sh
- rm -fr molecule_logs
- mkdir -p molecule_logs
- find ~/.cache/molecule/ \( -name '*.out' -o -name '*.err' \) -type f | xargs tar -uf molecule_logs/molecule.tar
- gzip molecule_logs/molecule.tar
artifacts:
when: always
paths:
- molecule_logs/
cache:
key: $CI_JOB_NAME_SLUG
paths:
- .vagrant.d/boxes
- .cache/pip
policy: pull-push # TODO: change to "pull" when not on main
# CI template for periodic CI jobs
# Enabled when PERIODIC_CI_ENABLED var is set
.molecule_periodic:
only:
variables:
- $PERIODIC_CI_ENABLED
allow_failure: true
molecule:
extends: .molecule
script:
- ./tests/scripts/molecule_run.sh -i $ROLE
parallel:
matrix:
- ROLE:
- container-engine/cri-dockerd
- container-engine/containerd
- container-engine/cri-o
- container-engine/gvisor
- container-engine/youki
- adduser
- bastion-ssh-config
- bootstrap_os
molecule_full:
extends: .molecule_periodic
molecule_no_container_engines:
extends: .molecule
script:
- ./tests/scripts/molecule_run.sh -e container-engine
when: on_success
molecule_docker:
extends: .molecule
script:
- ./tests/scripts/molecule_run.sh -i container-engine/cri-dockerd
when: on_success
molecule_containerd:
extends: .molecule
script:
- ./tests/scripts/molecule_run.sh -i container-engine/containerd
when: on_success
molecule_cri-o:
extends: .molecule
stage: deploy-part1
script:
- ./tests/scripts/molecule_run.sh -i container-engine/cri-o
allow_failure: true
when: on_success
# # Stage 3 container engines don't get as much attention so allow them to fail
# molecule_kata:
# extends: .molecule
# stage: deploy-extended
# script:
# - ./tests/scripts/molecule_run.sh -i container-engine/kata-containers
# when: manual
# # FIXME: this test is broken (perma-failing)
molecule_gvisor:
extends: .molecule
stage: deploy-extended
script:
- ./tests/scripts/molecule_run.sh -i container-engine/gvisor
when: manual
# FIXME: this test is broken (perma-failing)
molecule_youki:
extends: .molecule
stage: deploy-extended
script:
- ./tests/scripts/molecule_run.sh -i container-engine/youki
when: manual
# FIXME: this test is broken (perma-failing)
rules:
- if: $CI_PIPELINE_SOURCE == "schedule" && $CI_PIPELINE_SCHEDULE_DESCRIPTION == "daily-ci"
when: on_success
- when: manual
allow_failure: true
extends: molecule
parallel:
matrix:
- ROLE:
# FIXME : tests below are perma-failing
- container-engine/kata-containers


@@ -1,246 +0,0 @@
---
.packet:
extends: .testcases
variables:
ANSIBLE_TIMEOUT: "120"
CI_PLATFORM: packet
SSH_USER: kubespray
tags:
- ffci
needs:
- pipeline-image
- ci-not-authorized
# CI template for PRs
.packet_pr:
stage: deploy-part1
rules:
- if: $PR_LABELS =~ /.*ci-short.*/
when: manual
allow_failure: true
- if: $CI_COMMIT_BRANCH =~ /^pr-.*$/
when: on_success
- when: manual
allow_failure: true
extends: .packet
## Uncomment this to have multiple stages
# needs:
# - packet_ubuntu20-calico-all-in-one
.packet_pr_short:
stage: deploy-part1
extends: .packet
rules:
- if: $CI_COMMIT_BRANCH =~ /^pr-.*$/
when: on_success
- when: manual
allow_failure: true
.packet_pr_manual:
extends: .packet_pr
stage: deploy-extended
rules:
- if: $PR_LABELS =~ /.*ci-full.*/
when: on_success
# Else run as manual
- when: manual
allow_failure: true
.packet_pr_extended:
extends: .packet_pr
stage: deploy-extended
rules:
- if: $PR_LABELS =~ /.*(ci-extended|ci-full).*/
when: on_success
- when: manual
allow_failure: true
# CI template for periodic CI jobs
# Enabled when PERIODIC_CI_ENABLED var is set
.packet_periodic:
only:
variables:
- $PERIODIC_CI_ENABLED
allow_failure: true
extends: .packet
# The ubuntu20-calico-all-in-one jobs are meant as early stages to prevent running the full CI if something is horribly broken
packet_ubuntu20-calico-all-in-one:
stage: deploy-part1
extends: .packet_pr_short
variables:
RESET_CHECK: "true"
# ### PR JOBS PART2
packet_ubuntu20-crio:
extends: .packet_pr_manual
packet_ubuntu22-calico-all-in-one:
extends: .packet_pr
packet_ubuntu22-calico-all-in-one-upgrade:
extends: .packet_pr
variables:
UPGRADE_TEST: graceful
packet_ubuntu24-calico-etcd-datastore:
extends: .packet_pr
packet_almalinux8-crio:
extends: .packet_pr
packet_almalinux8-kube-ovn:
extends: .packet_pr
packet_debian11-calico-collection:
extends: .packet_pr
packet_debian11-macvlan:
extends: .packet_pr
packet_debian12-cilium:
extends: .packet_pr
packet_rockylinux8-calico:
extends: .packet_pr
packet_rockylinux9-cilium:
extends: .packet_pr
variables:
RESET_CHECK: "true"
packet_amazon-linux-2-all-in-one:
extends: .packet_pr
packet_opensuse-docker-cilium:
extends: .packet_pr
packet_ubuntu20-cilium-sep:
extends: .packet_pr
packet_openeuler24-calico:
extends: .packet_pr
packet_ubuntu20-calico-all-in-one-hardening:
extends: .packet_pr
## Extended
packet_debian11-docker:
extends: .packet_pr_extended
packet_debian12-docker:
extends: .packet_pr_extended
packet_debian12-calico:
extends: .packet_pr_extended
packet_almalinux8-calico-remove-node:
extends: .packet_pr_extended
variables:
REMOVE_NODE_CHECK: "true"
REMOVE_NODE_NAME: "instance-3"
packet_rockylinux9-calico:
extends: .packet_pr_extended
packet_almalinux8-calico:
extends: .packet_pr_extended
packet_almalinux8-docker:
extends: .packet_pr_extended
packet_ubuntu24-calico-all-in-one:
extends: .packet_pr_extended
packet_ubuntu20-calico-etcd-kubeadm:
extends: .packet_pr_extended
packet_ubuntu24-all-in-one-docker:
extends: .packet_pr_extended
packet_ubuntu22-all-in-one-docker:
extends: .packet_pr_extended
# ### MANUAL JOBS
packet_fedora39-crio:
extends: .packet_pr_manual
packet_ubuntu20-flannel-ha:
extends: .packet_pr_manual
packet_ubuntu20-all-in-one-docker:
extends: .packet_pr_manual
packet_ubuntu20-flannel-ha-once:
extends: .packet_pr_manual
packet_fedora39-calico-swap-selinux:
extends: .packet_pr_manual
packet_almalinux8-calico-ha-ebpf:
extends: .packet_pr_manual
packet_almalinux8-calico-nodelocaldns-secondary:
extends: .packet_pr_manual
packet_debian11-custom-cni:
extends: .packet_pr_manual
packet_debian11-kubelet-csr-approver:
extends: .packet_pr_manual
packet_debian12-custom-cni-helm:
extends: .packet_pr_manual
packet_ubuntu20-calico-ha-wireguard:
extends: .packet_pr_manual
# PERIODIC
packet_fedora40-docker-calico:
stage: deploy-extended
extends: .packet_periodic
variables:
RESET_CHECK: "true"
packet_fedora39-calico-selinux:
stage: deploy-extended
extends: .packet_periodic
packet_ubuntu20-calico-etcd-kubeadm-upgrade-ha:
stage: deploy-extended
extends: .packet_periodic
variables:
UPGRADE_TEST: basic
packet_debian11-calico-upgrade-once:
stage: deploy-extended
extends: .packet_periodic
variables:
UPGRADE_TEST: graceful
packet_ubuntu20-calico-ha-recover:
stage: deploy-extended
extends: .packet_periodic
variables:
RECOVER_CONTROL_PLANE_TEST: "true"
RECOVER_CONTROL_PLANE_TEST_GROUPS: "etcd[2:]:kube_control_plane[1:]"
packet_ubuntu20-calico-ha-recover-noquorum:
stage: deploy-extended
extends: .packet_periodic
variables:
RECOVER_CONTROL_PLANE_TEST: "true"
RECOVER_CONTROL_PLANE_TEST_GROUPS: "etcd[1:]:kube_control_plane[1:]"
packet_debian11-calico-upgrade:
stage: deploy-extended
extends: .packet_periodic
variables:
UPGRADE_TEST: graceful
packet_debian12-cilium-svc-proxy:
stage: deploy-extended
extends: .packet_periodic


@@ -1,17 +0,0 @@
---
# stub pipeline for dynamic generation
pre-commit:
tags:
- light
image: 'ghcr.io/pre-commit-ci/runner-image@sha256:aaf2c7b38b22286f2d381c11673bec571c28f61dd086d11b43a1c9444a813cef'
variables:
PRE_COMMIT_HOME: /pre-commit-cache
script:
- pre-commit run --all-files
cache:
key: pre-commit-$HOOK_ID
paths:
- /pre-commit-cache
parallel:
matrix:
- HOOK_ID:


@@ -3,39 +3,41 @@
.terraform_install:
extends: .job
needs:
- ci-not-authorized
- pipeline-image
variables:
TF_VAR_public_key_path: "${ANSIBLE_PRIVATE_KEY_FILE}.pub"
TF_VAR_ssh_private_key_path: $ANSIBLE_PRIVATE_KEY_FILE
CLUSTER: $CI_COMMIT_REF_NAME
TERRAFORM_STATE_ROOT: $CI_PROJECT_DIR
stage: deploy-part1
before_script:
- update-alternatives --install /usr/bin/python python /usr/bin/python3 1
- ./tests/scripts/rebase.sh
- ./tests/scripts/testcases_prepare.sh
- ./tests/scripts/terraform_install.sh
# Set Ansible config
- cp ansible.cfg ~/.ansible.cfg
# Prepare inventory
- mkdir -p cluster-dump $ANSIBLE_INVENTORY
- ./tests/scripts/opentofu_install.sh
- cp contrib/terraform/$PROVIDER/sample-inventory/cluster.tfvars .
- ln -s contrib/terraform/$PROVIDER/hosts
- terraform -chdir="contrib/terraform/$PROVIDER" init
# Copy SSH keypair
- mkdir -p ~/.ssh
- echo "$PACKET_PRIVATE_KEY" | base64 -d > ~/.ssh/id_rsa
- chmod 400 ~/.ssh/id_rsa
- echo "$PACKET_PUBLIC_KEY" | base64 -d > ~/.ssh/id_rsa.pub
- mkdir -p contrib/terraform/$PROVIDER/group_vars
# Random subnet to avoid routing conflicts
- export TF_VAR_subnet_cidr="10.$(( $RANDOM % 256 )).$(( $RANDOM % 256 )).0/24"
- ln -rs -t $ANSIBLE_INVENTORY contrib/terraform/$PROVIDER/hosts
- tofu -chdir="contrib/terraform/$PROVIDER" init
.terraform_validate:
terraform_validate:
extends: .terraform_install
tags: [ffci]
only: ['master', /^pr-.*$/]
script:
- terraform -chdir="contrib/terraform/$PROVIDER" validate
- terraform -chdir="contrib/terraform/$PROVIDER" fmt -check -diff
- tofu -chdir="contrib/terraform/$PROVIDER" validate
- tofu -chdir="contrib/terraform/$PROVIDER" fmt -check -diff
stage: test
needs:
- pipeline-image
parallel:
matrix:
- PROVIDER:
- openstack
- aws
- exoscale
- hetzner
- vsphere
- upcloud
- nifcloud
.terraform_apply:
extends: .terraform_install
@@ -43,99 +45,24 @@
stage: deploy-extended
when: manual
only: [/^pr-.*$/]
artifacts:
when: always
paths:
- cluster-dump/
variables:
ANSIBLE_INVENTORY_UNPARSED_FAILED: "true"
ANSIBLE_INVENTORY: hosts
CI_PLATFORM: tf
TF_VAR_ssh_user: $SSH_USER
ANSIBLE_REMOTE_USER: ubuntu # the openstack terraform module does not handle custom user correctly
ANSIBLE_SSH_RETRIES: 15
TF_VAR_ssh_user: $ANSIBLE_REMOTE_USER
TF_VAR_cluster_name: $CI_JOB_ID
script:
# Set Ansible config
- cp ansible.cfg ~/.ansible.cfg
- ssh-keygen -N '' -f $ANSIBLE_PRIVATE_KEY_FILE -t rsa
- mkdir -p contrib/terraform/$PROVIDER/group_vars
# Random subnet to avoid routing conflicts
- export TF_VAR_subnet_cidr="10.$(( $RANDOM % 256 )).$(( $RANDOM % 256 )).0/24"
- tofu -chdir="contrib/terraform/$PROVIDER" apply -auto-approve -parallelism=1
- tests/scripts/testcases_run.sh
after_script:
# Cleanup regardless of exit code
- ./tests/scripts/testcases_cleanup.sh
tf-validate-openstack:
extends: .terraform_validate
variables:
TF_VERSION: $TERRAFORM_VERSION
PROVIDER: openstack
CLUSTER: $CI_COMMIT_REF_NAME
tf-validate-equinix:
extends: .terraform_validate
variables:
TF_VERSION: $TERRAFORM_VERSION
PROVIDER: equinix
CLUSTER: $CI_COMMIT_REF_NAME
tf-validate-aws:
extends: .terraform_validate
variables:
TF_VERSION: $TERRAFORM_VERSION
PROVIDER: aws
CLUSTER: $CI_COMMIT_REF_NAME
tf-validate-exoscale:
extends: .terraform_validate
variables:
TF_VERSION: $TERRAFORM_VERSION
PROVIDER: exoscale
tf-validate-hetzner:
extends: .terraform_validate
variables:
TF_VERSION: $TERRAFORM_VERSION
PROVIDER: hetzner
tf-validate-vsphere:
extends: .terraform_validate
variables:
TF_VERSION: $TERRAFORM_VERSION
PROVIDER: vsphere
CLUSTER: $CI_COMMIT_REF_NAME
tf-validate-upcloud:
extends: .terraform_validate
variables:
TF_VERSION: $TERRAFORM_VERSION
PROVIDER: upcloud
CLUSTER: $CI_COMMIT_REF_NAME
tf-validate-nifcloud:
extends: .terraform_validate
variables:
TF_VERSION: $TERRAFORM_VERSION
PROVIDER: nifcloud
# tf-packet-ubuntu20-default:
# extends: .terraform_apply
# variables:
# TF_VERSION: $TERRAFORM_VERSION
# PROVIDER: packet
# CLUSTER: $CI_COMMIT_REF_NAME
# TF_VAR_number_of_k8s_masters: "1"
# TF_VAR_number_of_k8s_nodes: "1"
# TF_VAR_plan_k8s_masters: t1.small.x86
# TF_VAR_plan_k8s_nodes: t1.small.x86
# TF_VAR_metro: am
# TF_VAR_public_key_path: ""
# TF_VAR_operating_system: ubuntu_20_04
.ovh_variables: &ovh_variables
OS_AUTH_URL: https://auth.cloud.ovh.net/v3
OS_PROJECT_ID: 8d3cd5d737d74227ace462dee0b903fe
OS_PROJECT_NAME: "9361447987648822"
OS_USER_DOMAIN_NAME: Default
OS_PROJECT_DOMAIN_ID: default
OS_USERNAME: 8XuhBMfkKVrk
OS_REGION_NAME: UK1
OS_INTERFACE: public
OS_IDENTITY_API_VERSION: "3"
- tofu -chdir="contrib/terraform/$PROVIDER" destroy -auto-approve
# Elastx is generously donating resources for Kubespray on Openstack CI
# Contacts: @gix @bl0m1
@@ -169,11 +96,8 @@ tf-elastx_ubuntu20-calico:
allow_failure: true
variables:
<<: *elastx_variables
TF_VERSION: $TERRAFORM_VERSION
PROVIDER: openstack
CLUSTER: $CI_COMMIT_REF_NAME
ANSIBLE_TIMEOUT: "60"
SSH_USER: ubuntu
TF_VAR_number_of_k8s_masters: "1"
TF_VAR_number_of_k8s_masters_no_floating_ip: "0"
TF_VAR_number_of_k8s_masters_no_floating_ip_no_etcd: "0"
@@ -194,46 +118,3 @@ tf-elastx_ubuntu20-calico:
TF_VAR_flavor_k8s_node: 3f73fc93-ec61-4808-88df-2580d94c1a9b # v1-standard-2
TF_VAR_image: ubuntu-20.04-server-latest
TF_VAR_k8s_allowed_remote_ips: '["0.0.0.0/0"]'
# OVH voucher expired, commenting job until things are sorted out
# tf-ovh_cleanup:
# stage: unit-tests
# tags: [light]
# image: python
# environment: ovh
# variables:
# <<: *ovh_variables
# before_script:
# - pip install -r scripts/openstack-cleanup/requirements.txt
# script:
# - ./scripts/openstack-cleanup/main.py
# tf-ovh_ubuntu20-calico:
# extends: .terraform_apply
# when: on_success
# environment: ovh
# variables:
# <<: *ovh_variables
# TF_VERSION: $TERRAFORM_VERSION
# PROVIDER: openstack
# CLUSTER: $CI_COMMIT_REF_NAME
# ANSIBLE_TIMEOUT: "60"
# SSH_USER: ubuntu
# TF_VAR_number_of_k8s_masters: "0"
# TF_VAR_number_of_k8s_masters_no_floating_ip: "1"
# TF_VAR_number_of_k8s_masters_no_floating_ip_no_etcd: "0"
# TF_VAR_number_of_etcd: "0"
# TF_VAR_number_of_k8s_nodes: "0"
# TF_VAR_number_of_k8s_nodes_no_floating_ip: "1"
# TF_VAR_number_of_gfs_nodes_no_floating_ip: "0"
# TF_VAR_number_of_bastions: "0"
# TF_VAR_number_of_k8s_masters_no_etcd: "0"
# TF_VAR_use_neutron: "0"
# TF_VAR_floatingip_pool: "Ext-Net"
# TF_VAR_external_net: "6011fbc9-4cbf-46a4-8452-6890a340b60b"
# TF_VAR_network_name: "Ext-Net"
# TF_VAR_flavor_k8s_master: "defa64c3-bd46-43b4-858a-d93bbae0a229" # s1-8
# TF_VAR_flavor_k8s_node: "defa64c3-bd46-43b4-858a-d93bbae0a229" # s1-8
# TF_VAR_image: "Ubuntu 20.04"
# TF_VAR_k8s_allowed_remote_ips: '["0.0.0.0/0"]'


@@ -1,20 +1,16 @@
---
.vagrant:
extends: .testcases
needs:
- ci-not-authorized
vagrant:
extends: .job-moderated
variables:
CI_PLATFORM: "vagrant"
SSH_USER: "vagrant"
VAGRANT_DEFAULT_PROVIDER: "libvirt"
KUBESPRAY_VAGRANT_CONFIG: tests/files/${CI_JOB_NAME}.rb
KUBESPRAY_VAGRANT_CONFIG: tests/files/${TESTCASE}.rb
DOCKER_NAME: vagrant
VAGRANT_ANSIBLE_TAGS: facts
VAGRANT_HOME: "$CI_PROJECT_DIR/.vagrant.d"
PIP_CACHE_DIR: "$CI_PROJECT_DIR/.cache/pip"
tags: [ffci-vm-large]
# only: [/^pr-.*$/]
# except: ['triggers']
image: quay.io/kubespray/vm-kubespray-ci:v13
services: []
before_script:
@@ -28,44 +24,26 @@
- pip install --no-compile --no-cache-dir -r $CI_PROJECT_DIR/tests/requirements.txt
- ./tests/scripts/vagrant_clean.sh
script:
- vagrant up
- ./tests/scripts/testcases_run.sh
after_script:
- vagrant destroy -f
cache:
key: $CI_JOB_NAME_SLUG
paths:
- .vagrant.d/boxes
- .cache/pip
policy: pull-push # TODO: change to "pull" when not on main
vagrant_ubuntu20-calico-dual-stack:
stage: deploy-extended
extends: .vagrant
when: manual
# FIXME: this test is broken (perma-failing)
vagrant_ubuntu20-flannel:
stage: deploy-part1
extends: .vagrant
when: on_success
allow_failure: false
vagrant_ubuntu20-flannel-collection:
stage: deploy-extended
extends: .vagrant
when: manual
vagrant_ubuntu20-kube-router-sep:
stage: deploy-extended
extends: .vagrant
when: manual
# Service proxy test fails connectivity testing
vagrant_ubuntu20-kube-router-svc-proxy:
stage: deploy-extended
extends: .vagrant
when: manual
vagrant_fedora39-kube-router:
stage: deploy-extended
extends: .vagrant
when: manual
# FIXME: this test is broken (perma-failing)
rules:
- if: $PR_LABELS =~ /.*(ci-extended|ci-full).*/
when: on_success
- if: $CI_PIPELINE_SOURCE == "schedule" && $CI_PIPELINE_SCHEDULE_DESCRIPTION == "daily-ci"
when: on_success
- when: manual
allow_failure: true
parallel:
matrix:
- TESTCASE:
- ubuntu24-calico-dual-stack
- ubuntu24-calico-ipv6only-stack


@@ -1,7 +1,7 @@
---
repos:
- repo: https://github.com/pre-commit/pre-commit-hooks
rev: v5.0.0
rev: v6.0.0
hooks:
- id: check-added-large-files
- id: check-case-conflict
@@ -15,19 +15,13 @@ repos:
- id: trailing-whitespace
- repo: https://github.com/adrienverge/yamllint.git
rev: v1.35.1
rev: v1.37.1
hooks:
- id: yamllint
args: [--strict]
- repo: https://github.com/markdownlint/markdownlint
rev: v0.12.0
hooks:
- id: markdownlint
exclude: "^.github|(^docs/_sidebar\\.md$)"
- repo: https://github.com/shellcheck-py/shellcheck-py
rev: v0.10.0.1
rev: v0.11.0.1
hooks:
- id: shellcheck
args: ["--severity=error"]
@@ -35,7 +29,7 @@ repos:
files: "\\.sh$"
- repo: https://github.com/ansible/ansible-lint
rev: v24.12.2
rev: v25.11.0
hooks:
- id: ansible-lint
additional_dependencies:
@@ -44,19 +38,13 @@ repos:
- distlib
- repo: https://github.com/golangci/misspell
rev: v0.6.0
rev: v0.7.0
hooks:
- id: misspell
exclude: "OWNERS_ALIASES$"
- repo: local
hooks:
- id: check-readme-versions
name: check-readme-versions
entry: tests/scripts/check_readme_versions.sh
language: script
pass_filenames: false
- id: collection-build-install
name: Build and install kubernetes-sigs.kubespray Ansible collection
language: python
@@ -82,6 +70,14 @@ repos:
- pathlib
- pyaml
- id: check-galaxy-version
name: Verify correct version for galaxy.yml
entry: scripts/galaxy_version.py
language: python
pass_filenames: false
additional_dependencies:
- ruamel.yaml
- id: jinja-syntax-check
name: jinja-syntax-check
entry: tests/scripts/check-templates.py
@@ -90,3 +86,25 @@ repos:
- jinja
additional_dependencies:
- jinja2
- id: propagate-ansible-variables
name: Update static files referencing default kubespray values
language: python
additional_dependencies:
- ansible-core>=2.16.4
entry: scripts/propagate_ansible_variables.yml
pass_filenames: false
- id: check-checksums-sorted
name: Check that our checksums are correctly sorted by version
entry: scripts/assert-sorted-checksums.yml
language: python
pass_filenames: false
additional_dependencies:
- ansible
- repo: https://github.com/markdownlint/markdownlint
rev: v0.12.0
hooks:
- id: markdownlint
exclude: "^.github|(^docs/_sidebar\\.md$)"


@@ -40,7 +40,7 @@ Vagrant with VirtualBox or libvirt driver helps you to quickly spin test cluster
2. The [repo owners](OWNERS) will respond to your issue promptly.
3. Fork the desired repo, develop and test your code changes.
4. Install [pre-commit](https://pre-commit.com) and install it in your development repo.
5. Addess any pre-commit validation failures.
5. Address any pre-commit validation failures.
6. Sign the CNCF CLA (<https://git.k8s.io/community/CLA.md#the-contributor-license-agreement>)
7. Submit a pull request.
8. Work with the reviewers on their suggestions.


@@ -1,6 +1,6 @@
# syntax=docker/dockerfile:1
# Use imutable image tags rather than mutable tags (like ubuntu:22.04)
# Use immutable image tags rather than mutable tags (like ubuntu:22.04)
FROM ubuntu:22.04@sha256:149d67e29f765f4db62aa52161009e99e389544e25a8f43c8c89d4a445a7ca37
# Some tools like yamllint need this
@@ -34,11 +34,9 @@ RUN --mount=type=bind,source=requirements.txt,target=requirements.txt \
SHELL ["/bin/bash", "-o", "pipefail", "-c"]
RUN --mount=type=bind,source=roles/kubespray-defaults/defaults/main/main.yml,target=roles/kubespray-defaults/defaults/main/main.yml \
KUBE_VERSION=$(sed -n 's/^kube_version: //p' roles/kubespray-defaults/defaults/main/main.yml) \
OS_ARCHITECTURE=$(dpkg --print-architecture) \
&& curl -L "https://dl.k8s.io/release/${KUBE_VERSION}/bin/linux/${OS_ARCHITECTURE}/kubectl" -o /usr/local/bin/kubectl \
&& echo "$(curl -L "https://dl.k8s.io/release/${KUBE_VERSION}/bin/linux/${OS_ARCHITECTURE}/kubectl.sha256")" /usr/local/bin/kubectl | sha256sum --check \
RUN OS_ARCHITECTURE=$(dpkg --print-architecture) \
&& curl -L "https://dl.k8s.io/release/v1.34.2/bin/linux/${OS_ARCHITECTURE}/kubectl" -o /usr/local/bin/kubectl \
&& echo "$(curl -L "https://dl.k8s.io/release/v1.34.2/bin/linux/${OS_ARCHITECTURE}/kubectl.sha256")" /usr/local/bin/kubectl | sha256sum --check \
&& chmod a+x /usr/local/bin/kubectl
COPY *.yml ./
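
The download step above fetches `kubectl` and checks it against the published `.sha256` digest with `sha256sum --check`. As a rough illustration of what that check amounts to, here is a minimal Python sketch. It is not part of the Dockerfile; the helper name `sha256_matches` is hypothetical, and it assumes the expected digest has already been fetched as a hex string.

```python
import hashlib


def sha256_matches(path: str, expected_hex: str) -> bool:
    """Return True when the file at `path` hashes to `expected_hex` (SHA-256)."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        # Read in chunks so a large binary is not loaded into memory at once.
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    # Tolerate surrounding whitespace and upper-case hex in the expected value.
    return digest.hexdigest() == expected_hex.strip().lower()
```

A build would typically abort when this returns `False`, which is the role `sha256sum --check` plays in the `RUN` chain through `&&`.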


@@ -1,17 +1,13 @@
aliases:
kubespray-approvers:
- cristicalin
- floryut
- liupeng0518
- mzaian
- oomichi
- yankay
- ant31
- mzaian
- tico88612
- vannten
- yankay
kubespray-reviewers:
- cyclinder
- erikjiang
- mrfreezeex
- mzaian
- tico88612
- vannten
@@ -19,8 +15,12 @@ aliases:
kubespray-emeritus_approvers:
- atoms
- chadswen
- cristicalin
- floryut
- liupeng0518
- luckysb
- mattymo
- miouge1
- oomichi
- riverzhang
- woopstar

104
README.md

@@ -15,6 +15,18 @@ You can get your invite [here](http://slack.k8s.io/)
Below are several ways to use Kubespray to deploy a Kubernetes cluster.
### Docker
Ensure you have installed Docker, then run:
```ShellSession
docker run --rm -it --mount type=bind,source="$(pwd)"/inventory/sample,dst=/inventory \
--mount type=bind,source="${HOME}"/.ssh/id_rsa,dst=/root/.ssh/id_rsa \
quay.io/kubespray/kubespray:v2.29.0 bash
# Inside the container you may now run the kubespray playbooks:
ansible-playbook -i /inventory/inventory.ini --private-key /root/.ssh/id_rsa cluster.yml
```
### Ansible
#### Usage
@@ -75,59 +87,62 @@ vagrant up
## Supported Linux Distributions
- **Flatcar Container Linux by Kinvolk**
- **Debian** Bookworm, Bullseye
- **Ubuntu** 20.04, 22.04, 24.04
- **CentOS/RHEL** [8, 9](docs/operating_systems/centos.md#centos-8)
- **Debian** Bookworm, Bullseye, Trixie
- **Ubuntu** 22.04, 24.04
- **CentOS/RHEL** [8, 9](docs/operating_systems/rhel.md#rhel-8)
- **Fedora** 39, 40
- **Fedora CoreOS** (see [fcos Note](docs/operating_systems/fcos.md))
- **openSUSE** Leap 15.x/Tumbleweed
- **Oracle Linux** [8, 9](docs/operating_systems/centos.md#centos-8)
- **Alma Linux** [8, 9](docs/operating_systems/centos.md#centos-8)
- **Rocky Linux** [8, 9](docs/operating_systems/centos.md#centos-8)
- **Oracle Linux** [8, 9](docs/operating_systems/rhel.md#rhel-8)
- **Alma Linux** [8, 9](docs/operating_systems/rhel.md#rhel-8)
- **Rocky Linux** [8, 9](docs/operating_systems/rhel.md#rhel-8)
- **Kylin Linux Advanced Server V10** (experimental: see [kylin linux notes](docs/operating_systems/kylinlinux.md))
- **Amazon Linux 2** (experimental: see [amazon linux notes](docs/operating_systems/amazonlinux.md))
- **UOS Linux** (experimental: see [uos linux notes](docs/operating_systems/uoslinux.md))
- **openEuler** (experimental: see [openEuler notes](docs/operating_systems/openeuler.md))
Note: Upstart/SysV init based OS types are not supported.
Note:
- Upstart/SysV init based OS types are not supported.
- [Kernel requirements](docs/operations/kernel-requirements.md) (please read if the OS kernel version is < 4.19).
## Supported Components
<!-- BEGIN ANSIBLE MANAGED BLOCK -->
- Core
- [kubernetes](https://github.com/kubernetes/kubernetes) v1.31.4
- [etcd](https://github.com/etcd-io/etcd) v3.5.16
- [docker](https://www.docker.com/) v26.1
- [containerd](https://containerd.io/) v1.7.24
- [cri-o](http://cri-o.io/) v1.31.0 (experimental: see [CRI-O Note](docs/CRI/cri-o.md). Only on fedora, ubuntu and centos based OS)
- [kubernetes](https://github.com/kubernetes/kubernetes) 1.34.2
- [etcd](https://github.com/etcd-io/etcd) 3.5.25
- [docker](https://www.docker.com/) 28.3
- [containerd](https://containerd.io/) 2.1.5
- [cri-o](http://cri-o.io/) 1.34.3 (experimental: see [CRI-O Note](docs/CRI/cri-o.md). Only on fedora, ubuntu and centos based OS)
- Network Plugin
- [cni-plugins](https://github.com/containernetworking/plugins) v1.2.0
- [calico](https://github.com/projectcalico/calico) v3.29.1
- [cilium](https://github.com/cilium/cilium) v1.15.9
- [flannel](https://github.com/flannel-io/flannel) v0.22.0
- [kube-ovn](https://github.com/alauda/kube-ovn) v1.12.21
- [kube-router](https://github.com/cloudnativelabs/kube-router) v2.0.0
- [multus](https://github.com/k8snetworkplumbingwg/multus-cni) v3.8
- [weave](https://github.com/rajch/weave) v2.8.7
- [kube-vip](https://github.com/kube-vip/kube-vip) v0.8.0
- [cni-plugins](https://github.com/containernetworking/plugins) 1.8.0
- [calico](https://github.com/projectcalico/calico) 3.30.5
- [cilium](https://github.com/cilium/cilium) 1.18.4
- [flannel](https://github.com/flannel-io/flannel) 0.27.3
- [kube-ovn](https://github.com/alauda/kube-ovn) 1.12.21
- [kube-router](https://github.com/cloudnativelabs/kube-router) 2.1.1
- [multus](https://github.com/k8snetworkplumbingwg/multus-cni) 4.2.2
- [kube-vip](https://github.com/kube-vip/kube-vip) 0.8.0
- Application
- [cert-manager](https://github.com/jetstack/cert-manager) v1.15.3
- [coredns](https://github.com/coredns/coredns) v1.11.3
- [ingress-nginx](https://github.com/kubernetes/ingress-nginx) v1.12.0
- [krew](https://github.com/kubernetes-sigs/krew) v0.4.4
- [argocd](https://argoproj.github.io/) v2.11.0
- [helm](https://helm.sh/) v3.16.4
- [metallb](https://metallb.universe.tf/) v0.13.9
- [registry](https://github.com/distribution/distribution) v2.8.1
- [cert-manager](https://github.com/jetstack/cert-manager) 1.15.3
- [coredns](https://github.com/coredns/coredns) 1.12.1
- [ingress-nginx](https://github.com/kubernetes/ingress-nginx) 1.13.3
- [argocd](https://argoproj.github.io/) 2.14.5
- [helm](https://helm.sh/) 3.18.4
- [metallb](https://metallb.universe.tf/) 0.13.9
- [registry](https://github.com/distribution/distribution) 2.8.1
- Storage Plugin
- [cephfs-provisioner](https://github.com/kubernetes-incubator/external-storage) v2.1.0-k8s1.11
- [rbd-provisioner](https://github.com/kubernetes-incubator/external-storage) v2.1.1-k8s1.11
- [aws-ebs-csi-plugin](https://github.com/kubernetes-sigs/aws-ebs-csi-driver) v0.5.0
- [azure-csi-plugin](https://github.com/kubernetes-sigs/azuredisk-csi-driver) v1.10.0
- [cinder-csi-plugin](https://github.com/kubernetes/cloud-provider-openstack/blob/master/docs/cinder-csi-plugin/using-cinder-csi-plugin.md) v1.30.0
- [gcp-pd-csi-plugin](https://github.com/kubernetes-sigs/gcp-compute-persistent-disk-csi-driver) v1.9.2
- [local-path-provisioner](https://github.com/rancher/local-path-provisioner) v0.0.24
- [local-volume-provisioner](https://github.com/kubernetes-sigs/sig-storage-local-static-provisioner) v2.5.0
- [node-feature-discovery](https://github.com/kubernetes-sigs/node-feature-discovery) v0.16.4
- [aws-ebs-csi-plugin](https://github.com/kubernetes-sigs/aws-ebs-csi-driver) 0.5.0
- [azure-csi-plugin](https://github.com/kubernetes-sigs/azuredisk-csi-driver) 1.10.0
- [cinder-csi-plugin](https://github.com/kubernetes/cloud-provider-openstack/blob/master/docs/cinder-csi-plugin/using-cinder-csi-plugin.md) 1.30.0
- [gcp-pd-csi-plugin](https://github.com/kubernetes-sigs/gcp-compute-persistent-disk-csi-driver) 1.9.2
- [local-path-provisioner](https://github.com/rancher/local-path-provisioner) 0.0.32
- [local-volume-provisioner](https://github.com/kubernetes-sigs/sig-storage-local-static-provisioner) 2.5.0
- [node-feature-discovery](https://github.com/kubernetes-sigs/node-feature-discovery) 0.16.4
<!-- END ANSIBLE MANAGED BLOCK -->
## Container Runtime Notes
@@ -135,7 +150,7 @@ Note: Upstart/SysV init based OS types are not supported.
## Requirements
- **Minimum required version of Kubernetes is v1.29**
- **Minimum required version of Kubernetes is v1.30**
- **Ansible v2.14+, Jinja 2.11+ and python-netaddr are installed on the machine that will run Ansible commands**
- The target servers must have **access to the Internet** in order to pull docker images. Otherwise, additional configuration is required (See [Offline Environment](docs/operations/offline-environment.md))
- The target servers are configured to allow **IPv4 forwarding**.
@@ -149,10 +164,10 @@ Note: Upstart/SysV init based OS types are not supported.
Hardware:
These limits are safeguarded by Kubespray. Actual requirements for your workload can differ. For a sizing guide go to the [Building Large Clusters](https://kubernetes.io/docs/setup/cluster-large/#size-of-master-and-master-components) guide.
- Master
- Memory: 1500 MB
- Node
- Memory: 1024 MB
- Control Plane
- Memory: 2 GB
- Worker Node
- Memory: 1 GB
## Network Plugins
@@ -167,9 +182,6 @@ You can choose among ten network plugins. (default: `calico`, except Vagrant use
- [cilium](http://docs.cilium.io/en/latest/): layer 3/4 networking (as well as layer 7 to protect and secure application protocols), supports dynamic insertion of BPF bytecode into the Linux kernel to implement security services, networking and visibility logic.
- [weave](docs/CNI/weave.md): Weave is a lightweight container overlay network that doesn't require an external K/V database cluster.
(Please refer to `weave` [troubleshooting documentation](https://www.weave.works/docs/net/latest/troubleshooting/)).
- [kube-ovn](docs/CNI/kube-ovn.md): Kube-OVN integrates the OVN-based Network Virtualization with Kubernetes. It offers an advanced Container Network Fabric for Enterprises.
- [kube-router](docs/CNI/kube-router.md): Kube-router is a L3 CNI for Kubernetes networking aiming to provide operational


@@ -12,7 +12,6 @@ The Kubespray Project is released on an as-needed basis. The process is as follo
1. (For major releases) On the `master` branch: bump the version in `galaxy.yml` to the next expected major release (X.y.0 with y = Y + 1), make a Pull Request.
1. (For minor releases) On the `release-X.Y` branch: bump the version in `galaxy.yml` to the next expected minor release (X.Y.z with z = Z + 1), make a Pull Request.
1. The corresponding version of [quay.io/kubespray/kubespray:vX.Y.Z](https://quay.io/repository/kubespray/kubespray) and [quay.io/kubespray/vagrant:vX.Y.Z](https://quay.io/repository/kubespray/vagrant) container images are built and tagged. See the following `Container image creation` section for the details.
1. (Only for major releases) The `KUBESPRAY_VERSION` in `.gitlab-ci.yml` is upgraded to the version we just released # TODO clarify this, this variable is for testing upgrades.
1. The release issue is closed
1. An announcement email is sent to `dev@kubernetes.io` with the subject `[ANNOUNCE] Kubespray $VERSION is released`
1. The topic of the #kubespray channel is updated with `vX.Y.Z is released! | ...`
@@ -46,7 +45,7 @@ The Kubespray Project is released on an as-needed basis. The process is as follo
* Minor releases can change components' versions, but not the major `kube_version`.
Greater `kube_version` requires a new major or minor release. For example, if Kubespray v2.0.0
is bound to `kube_version: 1.4.x`, `calico_version: 0.22.0`, `etcd_version: v3.0.6`,
is bound to `kube_version: 1.4.x`, `calico_version: 0.22.0`, `etcd_version: 3.0.6`,
then Kubespray v2.1.0 may be bound to only minor changes to `kube_version`, like v1.5.1
and *any* changes to other components, like etcd v4, or calico 1.2.3.
And Kubespray v3.x.x shall be bound to `kube_version: 2.x.x` respectively.
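
The `galaxy.yml` bump rule from the release steps (X.y.0 with y = Y + 1 for what this document calls a major release, X.Y.z with z = Z + 1 for a minor release) can be sketched as follows. This is only an illustration under that reading of the steps; the function name `next_galaxy_version` and the sample version numbers are assumptions, not part of the release tooling.

```python
def next_galaxy_version(released: str, kind: str) -> str:
    """Compute the next expected version after `released` (e.g. 'v2.29.0')."""
    x, y, z = (int(part) for part in released.lstrip("v").split("."))
    if kind == "major":
        # On master: X.Y.Z -> X.(Y+1).0
        return f"{x}.{y + 1}.0"
    if kind == "minor":
        # On release-X.Y: X.Y.Z -> X.Y.(Z+1)
        return f"{x}.{y}.{z + 1}"
    raise ValueError(f"unknown release kind: {kind!r}")
```

For example, after a hypothetical `v2.29.0` release, `master` would be bumped to `2.30.0` and the `release-2.29` branch to `2.29.1`.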

63
Vagrantfile vendored

@@ -4,6 +4,8 @@
# For help on using kubespray with vagrant, check out docs/developers/vagrant.md
require 'fileutils'
require 'ipaddr'
require 'socket'
Vagrant.require_version ">= 2.0.0"
@@ -26,13 +28,14 @@ SUPPORTED_OS = {
"centos8-bento" => {box: "bento/centos-8", user: "vagrant"},
"almalinux8" => {box: "almalinux/8", user: "vagrant"},
"almalinux8-bento" => {box: "bento/almalinux-8", user: "vagrant"},
"almalinux9" => {box: "almalinux/9", user: "vagrant"},
"rockylinux8" => {box: "rockylinux/8", user: "vagrant"},
"rockylinux9" => {box: "rockylinux/9", user: "vagrant"},
"fedora39" => {box: "fedora/39-cloud-base", user: "vagrant"},
"fedora40" => {box: "fedora/40-cloud-base", user: "vagrant"},
"fedora39-arm64" => {box: "bento/fedora-39-arm64", user: "vagrant"},
"fedora40-arm64" => {box: "bento/fedora-40", user: "vagrant"},
"opensuse" => {box: "opensuse/Leap-15.4.x86_64", user: "vagrant"},
"opensuse" => {box: "opensuse/Leap-15.6.x86_64", user: "vagrant"},
"opensuse-tumbleweed" => {box: "opensuse/Tumbleweed.x86_64", user: "vagrant"},
"oraclelinux" => {box: "generic/oracle7", user: "vagrant"},
"oraclelinux8" => {box: "generic/oracle8", user: "vagrant"},
@@ -57,18 +60,27 @@ $subnet ||= "172.18.8"
$subnet_ipv6 ||= "fd3c:b398:0698:0756"
$os ||= "ubuntu2004"
$network_plugin ||= "flannel"
$inventory ||= "inventory/sample"
$inventories ||= [$inventory]
$inventories ||= []
# Setting multi_networking to true will install Multus: https://github.com/k8snetworkplumbingwg/multus-cni
$multi_networking ||= "False"
$download_run_once ||= "True"
$download_force_cache ||= "False"
# Modify those to have separate groups (for instance, to test separate etcd:)
# first_control_plane = 1
# first_etcd = 4
# control_plane_instances = 3
# etcd_instances = 3
$first_node ||= 1
$first_control_plane ||= 1
$first_etcd ||= 1
# The first three nodes are etcd servers
$etcd_instances ||= [$num_instances, 3].min
# The first two nodes are kube masters
$kube_master_instances ||= [$num_instances, 2].min
$control_plane_instances ||= [$num_instances, 2].min
# All nodes are kube nodes
$kube_node_instances ||= $num_instances
$kube_node_instances ||= $num_instances - $first_node + 1
# The following only works when using the libvirt provider
$kube_node_instances_with_disks ||= false
$kube_node_instances_with_disks_size ||= "20G"
@@ -89,6 +101,33 @@ $extra_vars ||= {}
host_vars = {}
def collect_networks(subnet, subnet_ipv6)
Socket.getifaddrs.filter_map do |iface|
next unless iface&.netmask&.ip_address && iface.addr
is_ipv6 = iface.addr.ipv6?
ip = IPAddr.new(iface.addr.ip_address.split('%').first)
ip_test = is_ipv6 ? IPAddr.new("#{subnet_ipv6}::0") : IPAddr.new("#{subnet}.0")
prefix = IPAddr.new(iface.netmask.ip_address).to_i.to_s(2).count('1')
network = ip.mask(prefix)
[IPAddr.new("#{network}/#{prefix}"), ip_test]
end
end
def subnet_in_use?(network_ips)
network_ips.any? { |net, test_ip| net.include?(test_ip) && test_ip != net }
end
network_ips = collect_networks($subnet, $subnet_ipv6)
if subnet_in_use?(network_ips)
puts "Invalid subnet provided, subnet is already in use: #{$subnet}.0"
puts "Subnets in use: #{network_ips.inspect}"
exit 1
end
# throw error if os is not supported
if ! SUPPORTED_OS.key?($os)
puts "Unsupported OS: #{$os}"
@@ -210,14 +249,20 @@ Vagrant.configure("2") do |config|
end
ip = "#{$subnet}.#{i+100}"
ip6 = "#{$subnet_ipv6}::#{i+100}"
node.vm.network :private_network,
:ip => ip,
:libvirt__guest_ipv6 => 'yes',
:libvirt__ipv6_address => "#{$subnet_ipv6}::#{i+100}",
:libvirt__ipv6_address => ip6,
:libvirt__ipv6_prefix => "64",
:libvirt__forward_mode => "none",
:libvirt__dhcp_enabled => false
# libvirt__ipv6_address does not work as intended: the address gets the desired prefix, but the host part is auto-generated (like fd3c:b398:698:756:5054:ff:fe48:c61e/64).
# Add a default route so that ansible_default_ipv6 can be detected.
# TODO: fix libvirt__ipv6 or use $subnet in shell
config.vm.provision "shell", inline: "ip -6 r a fd3c:b398:698:756::/64 dev eth1;ip -6 r add default via fd3c:b398:0698:0756::1 dev eth1 || true"
# Disable swap for each vm
node.vm.provision "shell", inline: "swapoff -a"
@@ -291,9 +336,9 @@ Vagrant.configure("2") do |config|
ansible.tags = [$ansible_tags]
end
ansible.groups = {
"etcd" => ["#{$instance_name_prefix}-[1:#{$etcd_instances}]"],
"kube_control_plane" => ["#{$instance_name_prefix}-[1:#{$kube_master_instances}]"],
"kube_node" => ["#{$instance_name_prefix}-[1:#{$kube_node_instances}]"],
"etcd" => ["#{$instance_name_prefix}-[#{$first_etcd}:#{$etcd_instances + $first_etcd - 1}]"],
"kube_control_plane" => ["#{$instance_name_prefix}-[#{$first_control_plane}:#{$control_plane_instances + $first_control_plane - 1}]"],
"kube_node" => ["#{$instance_name_prefix}-[#{$first_node}:#{$kube_node_instances + $first_node - 1}]"],
"k8s_cluster:children" => ["kube_control_plane", "kube_node"],
}
end
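
The new `ansible.groups` entries build each host range as `[first:count + first - 1]`, so a group no longer has to start at node 1. The following sketch mirrors that arithmetic outside Vagrant. It is an illustration only; the helper name `group_range` and the `k8s` prefix are assumptions.

```python
def group_range(prefix: str, first: int, count: int) -> str:
    """Return an Ansible host range pattern covering `count` nodes from `first`."""
    last = first + count - 1
    return f"{prefix}-[{first}:{last}]"


# Values from the commented example in the Vagrantfile:
# three control plane nodes starting at 1, three etcd nodes starting at 4.
control_plane = group_range("k8s", first=1, count=3)
etcd = group_range("k8s", first=4, count=3)
```

With the defaults (`$first_etcd`, `$first_control_plane` and `$first_node` all set to 1), the ranges reduce to the previous `[1:count]` form.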


@@ -15,7 +15,7 @@ timeout = 300
stdout_callback = default
display_skipped_hosts = no
library = ./library
callbacks_enabled = profile_tasks,ara_default
callbacks_enabled = profile_tasks
roles_path = roles:$VIRTUAL_ENV/usr/local/share/kubespray/roles:$VIRTUAL_ENV/usr/local/share/ansible/roles:/usr/share/kubespray/roles
deprecation_warnings=False
inventory_ignore_extensions = ~, .orig, .bak, .ini, .cfg, .retry, .pyc, .pyo, .creds, .gpg

9
contrib/collection.sh Executable file

@@ -0,0 +1,9 @@
#!/bin/bash -eux
# Install the collection from source, assuming dependencies are present.
# When run in SemaphoreUI, this bash script can install Kubespray from the repo.
NAMESPACE=kubernetes_sigs
COLLECTION=kubespray
MY_VER=$(grep '^version:' galaxy.yml|cut -d: -f2|sed 's/ //')
ansible-galaxy collection build --force --output-path .
ansible-galaxy collection install --offline --force $NAMESPACE-$COLLECTION-$MY_VER.tar.gz
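
The script derives the tarball name from the `version:` line of `galaxy.yml`. A small Python sketch of the same derivation is shown below, for readers who want to check the expected file name before running the script. The function name `collection_tarball` and the sample version are hypothetical, and unlike the `sed 's/ //'` in the script, this sketch strips all surrounding whitespace from the value.

```python
def collection_tarball(galaxy_yml_text: str,
                       namespace: str = "kubernetes_sigs",
                       collection: str = "kubespray") -> str:
    """Build '<namespace>-<collection>-<version>.tar.gz' from galaxy.yml text."""
    for line in galaxy_yml_text.splitlines():
        # Only a top-level key at the start of a line counts, like grep '^version:'.
        if line.startswith("version:"):
            version = line.split(":", 1)[1].strip()
            return f"{namespace}-{collection}-{version}.tar.gz"
    raise ValueError("no top-level 'version:' key found")


sample = "namespace: kubernetes_sigs\nname: kubespray\nversion: 2.29.0\n"
tarball = collection_tarball(sample)
```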


@@ -1,11 +0,0 @@
# Kubespray on KVM Virtual Machines hypervisor preparation
A simple playbook to ensure your system has the right settings to enable Kubespray
deployment on VMs.
This playbook does not create Virtual Machines, nor does it run Kubespray itself.
## User creation
If you want to create a user for running Kubespray deployment, you should specify
both `k8s_deployment_user` and `k8s_deployment_user_pkey_path`.


@@ -1,2 +0,0 @@
#k8s_deployment_user: kubespray
#k8s_deployment_user_pkey_path: /tmp/ssh_rsa


@@ -1,9 +0,0 @@
---
- name: Prepare Hypervisor to later install kubespray VMs
hosts: localhost
gather_facts: false
become: true
vars:
bootstrap_os: none
roles:
- { role: kvm-setup }


@@ -1,30 +0,0 @@
---
- name: Install required packages
package:
name: "{{ item }}"
state: present
with_items:
- bind-utils
- ntp
when: ansible_os_family == "RedHat"
- name: Install required packages
apt:
upgrade: true
update_cache: true
cache_valid_time: 3600
name: "{{ item }}"
state: present
install_recommends: false
with_items:
- dnsutils
- ntp
when: ansible_os_family == "Debian"
- name: Create deployment user if required
include_tasks: user.yml
when: k8s_deployment_user is defined
- name: Set proper sysctl values
import_tasks: sysctl.yml


@@ -1,46 +0,0 @@
---
- name: Load br_netfilter module
community.general.modprobe:
name: br_netfilter
state: present
register: br_netfilter
- name: Add br_netfilter into /etc/modules
lineinfile:
dest: /etc/modules
state: present
line: 'br_netfilter'
when: br_netfilter is defined and ansible_os_family == 'Debian'
- name: Add br_netfilter into /etc/modules-load.d/kubespray.conf
copy:
dest: /etc/modules-load.d/kubespray.conf
content: |-
### This file is managed by Ansible
br-netfilter
owner: root
group: root
mode: "0644"
when: br_netfilter is defined
- name: Enable net.ipv4.ip_forward in sysctl
ansible.posix.sysctl:
name: net.ipv4.ip_forward
value: 1
sysctl_file: "{{ sysctl_file_path }}"
state: present
reload: true
- name: Set bridge-nf-call-{arptables,iptables} to 0
ansible.posix.sysctl:
name: "{{ item }}"
state: present
value: 0
sysctl_file: "{{ sysctl_file_path }}"
reload: true
with_items:
- net.bridge.bridge-nf-call-arptables
- net.bridge.bridge-nf-call-ip6tables
- net.bridge.bridge-nf-call-iptables
when: br_netfilter is defined


@@ -1,47 +0,0 @@
---
- name: Create user {{ k8s_deployment_user }}
user:
name: "{{ k8s_deployment_user }}"
groups: adm
shell: /bin/bash
- name: Ensure that .ssh exists
file:
path: "/home/{{ k8s_deployment_user }}/.ssh"
state: directory
owner: "{{ k8s_deployment_user }}"
group: "{{ k8s_deployment_user }}"
mode: "0700"
- name: Configure sudo for deployment user
copy:
content: |
%{{ k8s_deployment_user }} ALL=(ALL) NOPASSWD: ALL
dest: "/etc/sudoers.d/55-k8s-deployment"
owner: root
group: root
mode: "0644"
- name: Write private SSH key
copy:
src: "{{ k8s_deployment_user_pkey_path }}"
dest: "/home/{{ k8s_deployment_user }}/.ssh/id_rsa"
mode: "0400"
owner: "{{ k8s_deployment_user }}"
group: "{{ k8s_deployment_user }}"
when: k8s_deployment_user_pkey_path is defined
- name: Write public SSH key
shell: "ssh-keygen -y -f /home/{{ k8s_deployment_user }}/.ssh/id_rsa \
> /home/{{ k8s_deployment_user }}/.ssh/authorized_keys"
args:
creates: "/home/{{ k8s_deployment_user }}/.ssh/authorized_keys"
when: k8s_deployment_user_pkey_path is defined
- name: Fix ssh-pub-key permissions
file:
path: "/home/{{ k8s_deployment_user }}/.ssh/authorized_keys"
mode: "0600"
owner: "{{ k8s_deployment_user }}"
group: "{{ k8s_deployment_user }}"
when: k8s_deployment_user_pkey_path is defined


@@ -1,15 +0,0 @@
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: kubernetes-dashboard
labels:
k8s-app: kubernetes-dashboard
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: cluster-admin
subjects:
- kind: ServiceAccount
name: kubernetes-dashboard
namespace: kube-system

View File

@@ -1,51 +0,0 @@
---
- name: Check ansible version
import_playbook: kubernetes_sigs.kubespray.ansible_version
- name: Install mitogen
hosts: localhost
strategy: linear
vars:
mitogen_version: 0.3.2
mitogen_url: https://github.com/mitogen-hq/mitogen/archive/refs/tags/v{{ mitogen_version }}.tar.gz
ansible_connection: local
tasks:
- name: Create mitogen plugin dir
file:
path: "{{ item }}"
state: directory
mode: "0755"
become: false
loop:
- "{{ playbook_dir }}/plugins/mitogen"
- "{{ playbook_dir }}/dist"
- name: Download mitogen release
get_url:
url: "{{ mitogen_url }}"
dest: "{{ playbook_dir }}/dist/mitogen_{{ mitogen_version }}.tar.gz"
validate_certs: true
mode: "0644"
- name: Extract archive
unarchive:
src: "{{ playbook_dir }}/dist/mitogen_{{ mitogen_version }}.tar.gz"
dest: "{{ playbook_dir }}/dist/"
- name: Copy plugin
ansible.posix.synchronize:
src: "{{ playbook_dir }}/dist/mitogen-{{ mitogen_version }}/"
dest: "{{ playbook_dir }}/plugins/mitogen"
- name: Add strategy to ansible.cfg
community.general.ini_file:
path: ansible.cfg
mode: "0644"
section: "{{ item.section | d('defaults') }}"
option: "{{ item.option }}"
value: "{{ item.value }}"
with_items:
- option: strategy
value: mitogen_linear
- option: strategy_plugins
value: plugins/mitogen/ansible_mitogen/plugins/strategy


@@ -1,92 +0,0 @@
# Deploying a Kubespray Kubernetes Cluster with GlusterFS
You can deploy either with Ansible on its own, supplying your own inventory file, or with Terraform to create the VMs and then provide a dynamic inventory to Ansible. The following two sections are self-contained; you don't need to go through one to use the other. If you want to provision with Terraform, you can skip the **Using an Ansible inventory** section, and if you want to provision with a pre-built Ansible inventory, you can skip the **Using Terraform and Ansible** section.
## Using an Ansible inventory
In the same directory as this README file you should find a file named `inventory.example`, which contains an example setup. Note that, in addition to the Kubernetes nodes/masters, we define a set of machines for GlusterFS and add them to the group `[gfs-cluster]`, which in turn is added to the larger `[network-storage]` group as a child group.
Change that file to reflect your local setup (adding or removing machines and setting the appropriate IP addresses), and save it to `inventory/sample/k8s_gfs_inventory`. Make sure that the settings in `inventory/sample/group_vars/all.yml` make sense for your deployment. Then change to the kubespray root folder and execute (assuming that the machines are all using Ubuntu):
```shell
ansible-playbook -b --become-user=root -i inventory/sample/k8s_gfs_inventory --user=ubuntu ./cluster.yml
```
This will provision your Kubernetes cluster. Then, to provision and configure the GlusterFS cluster, from the same directory execute:
```shell
ansible-playbook -b --become-user=root -i inventory/sample/k8s_gfs_inventory --user=ubuntu ./contrib/network-storage/glusterfs/glusterfs.yml
```
If your machines are not using Ubuntu, change `--user=ubuntu` to the correct user. Alternatively, if your Kubernetes machines use one OS and your GlusterFS machines a different one, you can instead set the `ansible_ssh_user=<correct-user>` variable in the inventory file that you just created, for each machine/VM:
```ini
k8s-master-1 ansible_ssh_host=192.168.0.147 ip=192.168.0.147 ansible_ssh_user=core
k8s-master-node-1 ansible_ssh_host=192.168.0.148 ip=192.168.0.148 ansible_ssh_user=core
k8s-master-node-2 ansible_ssh_host=192.168.0.146 ip=192.168.0.146 ansible_ssh_user=core
```
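When whole groups share a login user, the same variable can also be set once per group rather than per host. The following is a minimal sketch of that idea in Ansible's YAML inventory format. The `gfs-node-1` host, its address, and the choice of `core` and `ubuntu` as users are illustrative assumptions, not part of the shipped `inventory.example`.

```yaml
# Sketch only: host names and addresses below are placeholders.
all:
  children:
    kube_control_plane:
      vars:
        ansible_ssh_user: core        # login user for the Kubernetes machines
      hosts:
        k8s-master-1:
          ansible_ssh_host: 192.168.0.147
          ip: 192.168.0.147
    gfs-cluster:
      vars:
        ansible_ssh_user: ubuntu      # different OS, different login user
      hosts:
        gfs-node-1:                   # hypothetical GlusterFS node
          ansible_ssh_host: 192.168.0.150
          ip: 192.168.0.150
    network-storage:
      children:
        gfs-cluster:                  # gfs-cluster is a child of network-storage
```

Variables set directly on a host still take precedence over group-level values, so individual machines can be overridden where needed.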
## Using Terraform and Ansible
The first step is to fill in a `my-kubespray-gluster-cluster.tfvars` file with the desired specification for your cluster. An example with all required variables would look like:
```ini
cluster_name = "cluster1"
number_of_k8s_masters = "1"
number_of_k8s_masters_no_floating_ip = "2"
number_of_k8s_nodes_no_floating_ip = "0"
number_of_k8s_nodes = "0"
public_key_path = "~/.ssh/my-desired-key.pub"
image = "Ubuntu 16.04"
ssh_user = "ubuntu"
flavor_k8s_node = "node-flavor-id-in-your-openstack"
flavor_k8s_master = "master-flavor-id-in-your-openstack"
network_name = "k8s-network"
floatingip_pool = "net_external"
# GlusterFS variables
flavor_gfs_node = "gluster-flavor-id-in-your-openstack"
image_gfs = "Ubuntu 16.04"
number_of_gfs_nodes_no_floating_ip = "3"
gfs_volume_size_in_gb = "50"
ssh_user_gfs = "ubuntu"
```
As explained in the general terraform/openstack guide, you need to source your OpenStack credentials file, add your SSH key to the ssh-agent, and set up environment variables for Terraform:
```shell
$ source ~/.stackrc
$ eval $(ssh-agent -s)
$ ssh-add ~/.ssh/my-desired-key
$ echo Setting up Terraform creds && \
export TF_VAR_username=${OS_USERNAME} && \
export TF_VAR_password=${OS_PASSWORD} && \
export TF_VAR_tenant=${OS_TENANT_NAME} && \
export TF_VAR_auth_url=${OS_AUTH_URL}
```
Then, from the kubespray directory (the root of the Git checkout), issue the following Terraform command to create the VMs for the cluster:
```shell
terraform apply -state=contrib/terraform/openstack/terraform.tfstate -var-file=my-kubespray-gluster-cluster.tfvars contrib/terraform/openstack
```
This will create both your Kubernetes and Gluster VMs. Make sure that the Ansible file `contrib/terraform/openstack/group_vars/all.yml` includes any Ansible variable that you want to set (for instance, the type of machine for bootstrapping).
Then, provision your Kubernetes (kubespray) cluster with the following Ansible call:
```shell
ansible-playbook -b --become-user=root -i contrib/terraform/openstack/hosts ./cluster.yml
```
Finally, provision the GlusterFS nodes and add the Persistent Volume setup for GlusterFS in Kubernetes through the following Ansible call:
```shell
ansible-playbook -b --become-user=root -i contrib/terraform/openstack/hosts ./contrib/network-storage/glusterfs/glusterfs.yml
```
If you need to destroy the cluster, you can run:
```shell
terraform destroy -state=contrib/terraform/openstack/terraform.tfstate -var-file=my-kubespray-gluster-cluster.tfvars contrib/terraform/openstack
```

@@ -1,29 +0,0 @@
---
- name: Bootstrap hosts
  hosts: gfs-cluster
  gather_facts: false
  vars:
    ansible_ssh_pipelining: false
  roles:
    - { role: bootstrap-os, tags: bootstrap-os}

- name: Gather facts
  hosts: all
  gather_facts: true

- name: Install glusterfs server
  hosts: gfs-cluster
  vars:
    ansible_ssh_pipelining: true
  roles:
    - { role: glusterfs/server }

- name: Install glusterfs clients
  hosts: k8s_cluster
  roles:
    - { role: glusterfs/client }

- name: Configure Kubernetes to use glusterfs
  hosts: kube_control_plane[0]
  roles:
    - { role: kubernetes-pv }

@@ -1 +0,0 @@
../../../inventory/local/group_vars

@@ -1,43 +0,0 @@
# ## Configure 'ip' variable to bind kubernetes services on a
# ## different ip than the default iface
# node1 ansible_ssh_host=95.54.0.12 # ip=10.3.0.1
# node2 ansible_ssh_host=95.54.0.13 # ip=10.3.0.2
# node3 ansible_ssh_host=95.54.0.14 # ip=10.3.0.3
# node4 ansible_ssh_host=95.54.0.15 # ip=10.3.0.4
# node5 ansible_ssh_host=95.54.0.16 # ip=10.3.0.5
# node6 ansible_ssh_host=95.54.0.17 # ip=10.3.0.6
#
# ## GlusterFS nodes
# ## Set disk_volume_device_1 to desired device for gluster brick, if different to /dev/vdb (default).
# ## As in the previous case, you can set ip to give direct communication on internal IPs
# gfs_node1 ansible_ssh_host=95.54.0.18 # disk_volume_device_1=/dev/vdc ip=10.3.0.7
# gfs_node2 ansible_ssh_host=95.54.0.19 # disk_volume_device_1=/dev/vdc ip=10.3.0.8
# gfs_node3 ansible_ssh_host=95.54.0.20 # disk_volume_device_1=/dev/vdc ip=10.3.0.9
# [kube_control_plane]
# node1
# node2
# [etcd]
# node1
# node2
# node3
# [kube_node]
# node2
# node3
# node4
# node5
# node6
# [k8s_cluster:children]
# kube_node
# kube_control_plane
# [gfs-cluster]
# gfs_node1
# gfs_node2
# gfs_node3
# [network-storage:children]
# gfs-cluster

@@ -1 +0,0 @@
../../../../roles/bootstrap-os

@@ -1,50 +0,0 @@
# Ansible Role: GlusterFS
[![Build Status](https://travis-ci.org/geerlingguy/ansible-role-glusterfs.svg?branch=master)](https://travis-ci.org/geerlingguy/ansible-role-glusterfs)
Installs and configures GlusterFS on Linux.
## Requirements
For GlusterFS servers to communicate, the following ports must be open: TCP ports `24007` and `24008`; TCP port `24009` and up (or `49152` and up for GlusterFS 3.4+), with one additional incremented port for each additional server in the cluster; and TCP/UDP port `111`. You can open these using whatever firewall you wish (this can easily be configured using the `geerlingguy.firewall` role).
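One way to open these ports with Ansible is sketched below. It assumes the servers run firewalld and that the `ansible.posix` collection is installed; neither is required by this role, and it is an alternative to the `geerlingguy.firewall` role mentioned above. The `server` host group matches the example playbook in this README, and the upper bound `49160` is an arbitrary placeholder that should be sized to the number of servers.

```yaml
- name: Open GlusterFS ports (sketch, assumes firewalld)
  hosts: server
  become: true
  tasks:
    - name: Allow GlusterFS traffic between servers
      ansible.posix.firewalld:
        port: "{{ item }}"
        permanent: true     # persist the rule across reboots
        immediate: true     # also apply it to the running firewall
        state: enabled
      loop:
        - 24007-24008/tcp
        - 49152-49160/tcp   # placeholder range for GlusterFS 3.4+
        - 111/tcp
        - 111/udp
```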
This role performs basic installation and setup of Gluster, but it does not configure or mount bricks (volumes), since that step is easier to do in a series of plays in your own playbook. Ansible 1.9+ includes the [`gluster_volume`](https://docs.ansible.com/ansible/latest/collections/gluster/gluster/gluster_volume_module.html) module to ease the management of Gluster volumes.
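A minimal sketch of such a follow-up play is shown here, using the `gluster_volume` module to declare a replicated volume. The group name `server`, the volume name `gluster`, and the brick path are placeholder assumptions, and the brick path is expected to sit on an already mounted filesystem.

```yaml
- name: Create a replicated Gluster volume (sketch)
  hosts: server
  become: true
  tasks:
    - name: Ensure the brick directory exists
      ansible.builtin.file:
        path: /mnt/xfs-drive-gluster/brick   # placeholder brick path
        state: directory
        mode: "0775"

    - name: Configure the Gluster volume
      gluster.gluster.gluster_volume:
        state: present
        name: gluster                        # placeholder volume name
        bricks: /mnt/xfs-drive-gluster/brick
        replicas: "{{ groups['server'] | length }}"
        cluster: "{{ groups['server'] }}"    # assumes inventory names resolve between servers
        force: true
      run_once: true
      when: groups['server'] | length > 1    # replication needs at least two servers
```

The volume definition runs once because a Gluster volume is a cluster-wide object, while the brick directory task runs on every server.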
## Role Variables
Available variables are listed below, along with default values (see `defaults/main.yml`):
```yaml
glusterfs_default_release: ""
```
You can specify a `default_release` for apt on Debian/Ubuntu by overriding this variable. This is helpful if you need a different package or version for the main GlusterFS packages (e.g. GlusterFS 3.5.x instead of 3.2.x with the `wheezy-backports` default release on Debian Wheezy).
```yaml
glusterfs_ppa_use: true
glusterfs_ppa_version: "3.5"
```
For Ubuntu, specify whether to use the official Gluster PPA, and which version of the PPA to use. See Gluster's [Getting Started Guide](https://docs.gluster.org/en/latest/Quick-Start-Guide/Quickstart/) for more info.
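As an illustration, these variables could be overridden at play level. This is a sketch only: the version `"4.1"` is an example value and must correspond to a PPA that actually exists for your Ubuntu release.

```yaml
- hosts: server
  vars:
    glusterfs_ppa_use: true
    glusterfs_ppa_version: "4.1"   # illustrative; selects ppa:gluster/glusterfs-4.1
  roles:
    - geerlingguy.glusterfs
```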
## Dependencies
None.
## Example Playbook
```yaml
- hosts: server
  roles:
    - geerlingguy.glusterfs
```
For a real-world use example, read through [Simple GlusterFS Setup with Ansible](http://www.jeffgeerling.com/blog/simple-glusterfs-setup-ansible), a blog post by this role's author, which is included in Chapter 8 of [Ansible for DevOps](https://www.ansiblefordevops.com/).
## License
MIT / BSD
## Author Information
This role was created in 2015 by [Jeff Geerling](http://www.jeffgeerling.com/), author of [Ansible for DevOps](https://www.ansiblefordevops.com/).

@@ -1,11 +0,0 @@
---
# For Ubuntu.
glusterfs_default_release: ""
glusterfs_ppa_use: true
glusterfs_ppa_version: "4.1"
# Gluster configuration.
gluster_mount_dir: /mnt/gluster
gluster_volume_node_mount_dir: /mnt/xfs-drive-gluster
gluster_brick_dir: "{{ gluster_volume_node_mount_dir }}/brick"
gluster_brick_name: gluster

@@ -1,30 +0,0 @@
---
dependencies: []
galaxy_info:
author: geerlingguy
description: GlusterFS installation for Linux.
company: "Midwestern Mac, LLC"
license: "license (BSD, MIT)"
min_ansible_version: "2.0"
platforms:
- name: EL
versions:
- "6"
- "7"
- name: Ubuntu
versions:
- precise
- trusty
- xenial
- name: Debian
versions:
- wheezy
- jessie
galaxy_tags:
- system
- networking
- cloud
- clustering
- files
- sharing

@@ -1,21 +0,0 @@
---
# This is meant for Ubuntu and RedHat installations, where apparently the glusterfs-client is not used from inside
# hyperkube and needs to be installed as part of the system.
# Setup/install tasks.
- name: Setup RedHat distros for glusterfs
include_tasks: setup-RedHat.yml
when: ansible_os_family == 'RedHat' and groups['gfs-cluster'] is defined
- name: Setup Debian distros for glusterfs
include_tasks: setup-Debian.yml
when: ansible_os_family == 'Debian' and groups['gfs-cluster'] is defined
- name: Ensure Gluster mount directories exist.
file:
path: "{{ item }}"
state: directory
mode: "0775"
with_items:
- "{{ gluster_mount_dir }}"
when: ansible_os_family in ["Debian","RedHat"] and groups['gfs-cluster'] is defined

@@ -1,24 +0,0 @@
---
- name: Add PPA for GlusterFS.
apt_repository:
repo: 'ppa:gluster/glusterfs-{{ glusterfs_ppa_version }}'
state: present
update_cache: true
register: glusterfs_ppa_added
when: glusterfs_ppa_use
- name: Ensure GlusterFS client will reinstall if the PPA was just added. # noqa no-handler
apt:
name: "{{ item }}"
state: absent
with_items:
- glusterfs-client
when: glusterfs_ppa_added.changed
- name: Ensure GlusterFS client is installed.
apt:
name: "{{ item }}"
state: present
default_release: "{{ glusterfs_default_release }}"
with_items:
- glusterfs-client

@@ -1,14 +0,0 @@
---
- name: Install Prerequisites
package:
name: "{{ item }}"
state: present
with_items:
- "centos-release-gluster{{ glusterfs_default_release }}"
- name: Install Packages
package:
name: "{{ item }}"
state: present
with_items:
- glusterfs-client

@@ -1,13 +0,0 @@
---
# For Ubuntu.
glusterfs_default_release: ""
glusterfs_ppa_use: true
glusterfs_ppa_version: "3.12"
# Gluster configuration.
gluster_mount_dir: /mnt/gluster
gluster_volume_node_mount_dir: /mnt/xfs-drive-gluster
gluster_brick_dir: "{{ gluster_volume_node_mount_dir }}/brick"
gluster_brick_name: gluster
# Default device to mount for xfs formatting, terraform overrides this by setting the variable in the inventory.
disk_volume_device_1: /dev/vdb

@@ -1,30 +0,0 @@
---
dependencies: []
galaxy_info:
author: geerlingguy
description: GlusterFS installation for Linux.
company: "Midwestern Mac, LLC"
license: "license (BSD, MIT)"
min_ansible_version: "2.0"
platforms:
- name: EL
versions:
- "6"
- "7"
- name: Ubuntu
versions:
- precise
- trusty
- xenial
- name: Debian
versions:
- wheezy
- jessie
galaxy_tags:
- system
- networking
- cloud
- clustering
- files
- sharing

@@ -1,113 +0,0 @@
---
# Include variables and define needed variables.
- name: Include OS-specific variables.
include_vars: "{{ ansible_os_family }}.yml"
# Install xfs package
- name: Install xfs Debian
apt:
name: xfsprogs
state: present
when: ansible_os_family == "Debian"
- name: Install xfs RedHat
package:
name: xfsprogs
state: present
when: ansible_os_family == "RedHat"
# Format external volumes in xfs
- name: Format volumes in xfs
community.general.filesystem:
fstype: xfs
dev: "{{ disk_volume_device_1 }}"
# Mount external volumes
- name: Mounting new xfs filesystem
ansible.posix.mount:
name: "{{ gluster_volume_node_mount_dir }}"
src: "{{ disk_volume_device_1 }}"
fstype: xfs
state: mounted
# Setup/install tasks.
- name: Setup RedHat distros for glusterfs
include_tasks: setup-RedHat.yml
when: ansible_os_family == 'RedHat'
- name: Setup Debian distros for glusterfs
include_tasks: setup-Debian.yml
when: ansible_os_family == 'Debian'
- name: Ensure GlusterFS is started and enabled at boot.
service:
name: "{{ glusterfs_daemon }}"
state: started
enabled: true
- name: Ensure Gluster brick and mount directories exist.
file:
path: "{{ item }}"
state: directory
mode: "0775"
with_items:
- "{{ gluster_brick_dir }}"
- "{{ gluster_mount_dir }}"
- name: Configure Gluster volume with replicas
gluster.gluster.gluster_volume:
state: present
name: "{{ gluster_brick_name }}"
brick: "{{ gluster_brick_dir }}"
replicas: "{{ groups['gfs-cluster'] | length }}"
cluster: "{% for item in groups['gfs-cluster'] -%}{{ hostvars[item]['ip'] | default(hostvars[item].ansible_default_ipv4['address']) }}{% if not loop.last %},{% endif %}{%- endfor %}"
host: "{{ inventory_hostname }}"
force: true
run_once: true
when: groups['gfs-cluster'] | length > 1
- name: Configure Gluster volume without replicas
gluster.gluster.gluster_volume:
state: present
name: "{{ gluster_brick_name }}"
brick: "{{ gluster_brick_dir }}"
cluster: "{% for item in groups['gfs-cluster'] -%}{{ hostvars[item]['ip'] | default(hostvars[item].ansible_default_ipv4['address']) }}{% if not loop.last %},{% endif %}{%- endfor %}"
host: "{{ inventory_hostname }}"
force: true
run_once: true
when: groups['gfs-cluster'] | length <= 1
- name: Mount glusterfs to retrieve disk size
ansible.posix.mount:
name: "{{ gluster_mount_dir }}"
src: "{{ ip | default(ansible_default_ipv4['address']) }}:/gluster"
fstype: glusterfs
opts: "defaults,_netdev"
state: mounted
when: groups['gfs-cluster'] is defined and inventory_hostname == groups['gfs-cluster'][0]
- name: Get Gluster disk size
setup:
filter: ansible_mounts
register: mounts_data
when: groups['gfs-cluster'] is defined and inventory_hostname == groups['gfs-cluster'][0]
- name: Set Gluster disk size to variable
set_fact:
gluster_disk_size_gb: "{{ (mounts_data.ansible_facts.ansible_mounts | selectattr('mount', 'equalto', gluster_mount_dir) | map(attribute='size_total') | first | int / (1024 * 1024 * 1024)) | int }}"
when: groups['gfs-cluster'] is defined and inventory_hostname == groups['gfs-cluster'][0]
- name: Create file on GlusterFS
template:
dest: "{{ gluster_mount_dir }}/.test-file.txt"
src: test-file.txt
mode: "0644"
when: groups['gfs-cluster'] is defined and inventory_hostname == groups['gfs-cluster'][0]
- name: Unmount glusterfs
ansible.posix.mount:
name: "{{ gluster_mount_dir }}"
fstype: glusterfs
src: "{{ ip | default(ansible_default_ipv4['address']) }}:/gluster"
state: unmounted
when: groups['gfs-cluster'] is defined and inventory_hostname == groups['gfs-cluster'][0]

@@ -1,26 +0,0 @@
---
- name: Add PPA for GlusterFS.
apt_repository:
repo: 'ppa:gluster/glusterfs-{{ glusterfs_ppa_version }}'
state: present
update_cache: true
register: glusterfs_ppa_added
when: glusterfs_ppa_use
- name: Ensure GlusterFS will reinstall if the PPA was just added. # noqa no-handler
apt:
name: "{{ item }}"
state: absent
with_items:
- glusterfs-server
- glusterfs-client
when: glusterfs_ppa_added.changed
- name: Ensure GlusterFS is installed.
apt:
name: "{{ item }}"
state: present
default_release: "{{ glusterfs_default_release }}"
with_items:
- glusterfs-server
- glusterfs-client

@@ -1,15 +0,0 @@
---
- name: Install Prerequisites
package:
name: "{{ item }}"
state: present
with_items:
- "centos-release-gluster{{ glusterfs_default_release }}"
- name: Install Packages
package:
name: "{{ item }}"
state: present
with_items:
- glusterfs-server
- glusterfs-client

@@ -1,2 +0,0 @@
---
glusterfs_daemon: glusterd

@@ -1,2 +0,0 @@
---
glusterfs_daemon: glusterd

@@ -1,23 +0,0 @@
---
- name: Kubernetes Apps | Lay Down k8s GlusterFS Endpoint and PV
template:
src: "{{ item.file }}"
dest: "{{ kube_config_dir }}/{{ item.dest }}"
mode: "0644"
with_items:
- { file: glusterfs-kubernetes-endpoint.json.j2, type: ep, dest: glusterfs-kubernetes-endpoint.json}
- { file: glusterfs-kubernetes-pv.yml.j2, type: pv, dest: glusterfs-kubernetes-pv.yml}
- { file: glusterfs-kubernetes-endpoint-svc.json.j2, type: svc, dest: glusterfs-kubernetes-endpoint-svc.json}
register: gluster_pv
when: inventory_hostname == groups['kube_control_plane'][0] and groups['gfs-cluster'] is defined and hostvars[groups['gfs-cluster'][0]].gluster_disk_size_gb is defined
- name: Kubernetes Apps | Set GlusterFS endpoint and PV
kube:
name: glusterfs
namespace: default
kubectl: "{{ bin_dir }}/kubectl"
resource: "{{ item.item.type }}"
filename: "{{ kube_config_dir }}/{{ item.item.dest }}"
state: "{{ item.changed | ternary('latest', 'present') }}"
with_items: "{{ gluster_pv.results }}"
when: inventory_hostname == groups['kube_control_plane'][0] and groups['gfs-cluster'] is defined

@@ -1,12 +0,0 @@
{
"kind": "Service",
"apiVersion": "v1",
"metadata": {
"name": "glusterfs"
},
"spec": {
"ports": [
{"port": 1}
]
}
}

@@ -1,23 +0,0 @@
{
"kind": "Endpoints",
"apiVersion": "v1",
"metadata": {
"name": "glusterfs"
},
"subsets": [
{% for host in groups['gfs-cluster'] %}
{
"addresses": [
{
"ip": "{{hostvars[host]['ip']|default(hostvars[host].ansible_default_ipv4['address'])}}"
}
],
"ports": [
{
"port": 1
}
]
}{%- if not loop.last %}, {% endif -%}
{% endfor %}
]
}

@@ -1,14 +0,0 @@
apiVersion: v1
kind: PersistentVolume
metadata:
name: glusterfs
spec:
capacity:
storage: "{{ hostvars[groups['gfs-cluster'][0]].gluster_disk_size_gb }}Gi"
accessModes:
- ReadWriteMany
glusterfs:
endpoints: glusterfs
path: gluster
readOnly: false
persistentVolumeReclaimPolicy: Retain

@@ -1,3 +0,0 @@
---
dependencies:
- {role: kubernetes-pv/ansible, tags: apps}

@@ -1,27 +0,0 @@
# Deploy Heketi/Glusterfs into Kubespray/Kubernetes
This playbook aims to automate [this](https://github.com/heketi/heketi/blob/master/docs/admin/install-kubernetes.md) tutorial. It deploys heketi/glusterfs into kubernetes and sets up a storageclass.
## Important notice
> Due to resource limits on the current project maintainers and general lack of contributions we are considering placing Heketi into a [near-maintenance mode](https://github.com/heketi/heketi#important-notice)
## Client Setup
Heketi provides a CLI that lets users administer the deployment and configuration of GlusterFS in Kubernetes. [Download and install the heketi-cli](https://github.com/heketi/heketi/releases) on your client machine.
## Install
Copy `inventory.yml.sample` to `inventory/sample/k8s_heketi_inventory.yml` and change it according to your setup.
```shell
ansible-playbook --ask-become -i inventory/sample/k8s_heketi_inventory.yml contrib/network-storage/heketi/heketi.yml
```
## Tear down
```shell
ansible-playbook --ask-become -i inventory/sample/k8s_heketi_inventory.yml contrib/network-storage/heketi/heketi-tear-down.yml
```
Add `--extra-vars "heketi_remove_lvm=true"` to the command above to also remove the LVM packages from the system.

@@ -1,11 +0,0 @@
---
- name: Tear down heketi
  hosts: kube_control_plane[0]
  roles:
    - { role: tear-down }

- name: Teardown disks in heketi
  hosts: heketi-node
  become: true
  roles:
    - { role: tear-down-disks }

@@ -1,12 +0,0 @@
---
- name: Prepare heketi install
  hosts: heketi-node
  roles:
    - { role: prepare }

- name: Provision heketi
  hosts: kube_control_plane[0]
  tags:
    - "provision"
  roles:
    - { role: provision }

@@ -1,33 +0,0 @@
all:
vars:
heketi_admin_key: "11elfeinhundertundelf"
heketi_user_key: "!!einseinseins"
glusterfs_daemonset:
readiness_probe:
timeout_seconds: 3
initial_delay_seconds: 3
liveness_probe:
timeout_seconds: 3
initial_delay_seconds: 10
children:
k8s_cluster:
vars:
kubelet_fail_swap_on: false
children:
kube_control_plane:
hosts:
node1:
etcd:
hosts:
node2:
kube_node:
hosts: &kube_nodes
node1:
node2:
node3:
node4:
heketi-node:
vars:
disk_volume_device_1: "/dev/vdb"
hosts:
<<: *kube_nodes

@@ -1 +0,0 @@
jmespath

@@ -1,24 +0,0 @@
---
- name: "Load lvm kernel modules"
become: true
with_items:
- "dm_snapshot"
- "dm_mirror"
- "dm_thin_pool"
community.general.modprobe:
name: "{{ item }}"
state: "present"
- name: "Install glusterfs mount utils (RedHat)"
become: true
package:
name: "glusterfs-fuse"
state: "present"
when: "ansible_os_family == 'RedHat'"
- name: "Install glusterfs mount utils (Debian)"
become: true
apt:
name: "glusterfs-client"
state: "present"
when: "ansible_os_family == 'Debian'"

@@ -1,3 +0,0 @@
---
- name: "Stop port forwarding"
command: "killall "

@@ -1,64 +0,0 @@
---
# Bootstrap heketi
- name: "Get state of heketi service, deployment and pods."
register: "initial_heketi_state"
changed_when: false
command: "{{ bin_dir }}/kubectl get services,deployments,pods --selector=deploy-heketi --output=json"
- name: "Bootstrap heketi."
when:
- "(initial_heketi_state.stdout | from_json | json_query(\"items[?kind=='Service']\")) | length == 0"
- "(initial_heketi_state.stdout | from_json | json_query(\"items[?kind=='Deployment']\")) | length == 0"
- "(initial_heketi_state.stdout | from_json | json_query(\"items[?kind=='Pod']\")) | length == 0"
include_tasks: "bootstrap/deploy.yml"
# Prepare heketi topology
- name: "Get heketi initial pod state."
register: "initial_heketi_pod"
command: "{{ bin_dir }}/kubectl get pods --selector=deploy-heketi=pod,glusterfs=heketi-pod,name=deploy-heketi --output=json"
changed_when: false
- name: "Ensure heketi bootstrap pod is up."
assert:
that: "(initial_heketi_pod.stdout | from_json | json_query('items[*]')) | length == 1"
- name: Store the initial heketi pod name
set_fact:
initial_heketi_pod_name: "{{ initial_heketi_pod.stdout | from_json | json_query(\"items[*].metadata.name | [0]\") }}"
- name: "Test heketi topology."
changed_when: false
register: "heketi_topology"
command: "{{ bin_dir }}/kubectl exec {{ initial_heketi_pod_name }} -- heketi-cli --user admin --secret {{ heketi_admin_key }} topology info --json"
- name: "Load heketi topology."
when: "heketi_topology.stdout | from_json | json_query(\"clusters[*].nodes[*]\") | flatten | length == 0"
include_tasks: "bootstrap/topology.yml"
# Provision heketi database volume
- name: "Prepare heketi volumes."
include_tasks: "bootstrap/volumes.yml"
# Remove bootstrap heketi
- name: "Tear down bootstrap."
include_tasks: "bootstrap/tear-down.yml"
# Prepare heketi storage
- name: "Test heketi storage."
command: "{{ bin_dir }}/kubectl get secrets,endpoints,services,jobs --output=json"
changed_when: false
register: "heketi_storage_state"
# ensure endpoints actually exist before trying to move database data to it
- name: "Create heketi storage."
include_tasks: "bootstrap/storage.yml"
vars:
secret_query: "items[?metadata.name=='heketi-storage-secret' && kind=='Secret']"
endpoints_query: "items[?metadata.name=='heketi-storage-endpoints' && kind=='Endpoints']"
service_query: "items[?metadata.name=='heketi-storage-endpoints' && kind=='Service']"
job_query: "items[?metadata.name=='heketi-storage-copy-job' && kind=='Job']"
when:
- "heketi_storage_state.stdout | from_json | json_query(secret_query) | length == 0"
- "heketi_storage_state.stdout | from_json | json_query(endpoints_query) | length == 0"
- "heketi_storage_state.stdout | from_json | json_query(service_query) | length == 0"
- "heketi_storage_state.stdout | from_json | json_query(job_query) | length == 0"

@@ -1,27 +0,0 @@
---
- name: "Kubernetes Apps | Lay Down Heketi Bootstrap"
become: true
template:
src: "heketi-bootstrap.json.j2"
dest: "{{ kube_config_dir }}/heketi-bootstrap.json"
mode: "0640"
register: "rendering"
- name: "Kubernetes Apps | Install and configure Heketi Bootstrap"
kube:
name: "GlusterFS"
kubectl: "{{ bin_dir }}/kubectl"
filename: "{{ kube_config_dir }}/heketi-bootstrap.json"
state: "{{ rendering.changed | ternary('latest', 'present') }}"
- name: "Wait for heketi bootstrap to complete."
changed_when: false
register: "initial_heketi_state"
vars:
initial_heketi_state: { stdout: "{}" }
pods_query: "items[?kind=='Pod'].status.conditions | [0][?type=='Ready'].status | [0]"
deployments_query: "items[?kind=='Deployment'].status.conditions | [0][?type=='Available'].status | [0]"
command: "{{ bin_dir }}/kubectl get services,deployments,pods --selector=deploy-heketi --output=json"
until:
- "initial_heketi_state.stdout | from_json | json_query(pods_query) == 'True'"
- "initial_heketi_state.stdout | from_json | json_query(deployments_query) == 'True'"
retries: 60
delay: 5

@@ -1,33 +0,0 @@
---
- name: "Test heketi storage."
command: "{{ bin_dir }}/kubectl get secrets,endpoints,services,jobs --output=json"
changed_when: false
register: "heketi_storage_state"
- name: "Create heketi storage."
kube:
name: "GlusterFS"
kubectl: "{{ bin_dir }}/kubectl"
filename: "{{ kube_config_dir }}/heketi-storage-bootstrap.json"
state: "present"
vars:
secret_query: "items[?metadata.name=='heketi-storage-secret' && kind=='Secret']"
endpoints_query: "items[?metadata.name=='heketi-storage-endpoints' && kind=='Endpoints']"
service_query: "items[?metadata.name=='heketi-storage-endpoints' && kind=='Service']"
job_query: "items[?metadata.name=='heketi-storage-copy-job' && kind=='Job']"
when:
- "heketi_storage_state.stdout | from_json | json_query(secret_query) | length == 0"
- "heketi_storage_state.stdout | from_json | json_query(endpoints_query) | length == 0"
- "heketi_storage_state.stdout | from_json | json_query(service_query) | length == 0"
- "heketi_storage_state.stdout | from_json | json_query(job_query) | length == 0"
register: "heketi_storage_result"
- name: "Get state of heketi database copy job."
command: "{{ bin_dir }}/kubectl get jobs --output=json"
changed_when: false
register: "heketi_storage_state"
vars:
heketi_storage_state: { stdout: "{}" }
job_query: "items[?metadata.name=='heketi-storage-copy-job' && kind=='Job' && status.succeeded==1]"
until:
- "heketi_storage_state.stdout | from_json | json_query(job_query) | length == 1"
retries: 60
delay: 5

@@ -1,14 +0,0 @@
---
- name: "Get existing Heketi deploy resources."
command: "{{ bin_dir }}/kubectl get all --selector=\"deploy-heketi\" -o=json"
register: "heketi_resources"
changed_when: false
- name: "Delete bootstrap Heketi."
command: "{{ bin_dir }}/kubectl delete all,service,jobs,deployment,secret --selector=\"deploy-heketi\""
when: "heketi_resources.stdout | from_json | json_query('items[*]') | length > 0"
- name: "Ensure there is nothing left over."
command: "{{ bin_dir }}/kubectl get all,service,jobs,deployment,secret --selector=\"deploy-heketi\" -o=json"
register: "heketi_result"
until: "heketi_result.stdout | from_json | json_query('items[*]') | length == 0"
retries: 60
delay: 5

@@ -1,27 +0,0 @@
---
- name: "Get heketi topology."
changed_when: false
register: "heketi_topology"
command: "{{ bin_dir }}/kubectl exec {{ initial_heketi_pod_name }} -- heketi-cli --user admin --secret {{ heketi_admin_key }} topology info --json"
- name: "Render heketi topology template."
become: true
vars: { nodes: "{{ groups['heketi-node'] }}" }
register: "render"
template:
src: "topology.json.j2"
dest: "{{ kube_config_dir }}/topology.json"
mode: "0644"
- name: "Copy topology configuration into container."
changed_when: false
command: "{{ bin_dir }}/kubectl cp {{ kube_config_dir }}/topology.json {{ initial_heketi_pod_name }}:/tmp/topology.json"
- name: "Load heketi topology." # noqa no-handler
when: "render.changed"
command: "{{ bin_dir }}/kubectl exec {{ initial_heketi_pod_name }} -- heketi-cli --user admin --secret {{ heketi_admin_key }} topology load --json=/tmp/topology.json"
register: "load_heketi"
- name: "Get heketi topology."
changed_when: false
register: "heketi_topology"
command: "{{ bin_dir }}/kubectl exec {{ initial_heketi_pod_name }} -- heketi-cli --user admin --secret {{ heketi_admin_key }} topology info --json"
until: "heketi_topology.stdout | from_json | json_query(\"clusters[*].nodes[*].devices[?state=='online'].id\") | flatten | length == groups['heketi-node'] | length"
retries: 60
delay: 5

@@ -1,41 +0,0 @@
---
- name: "Get heketi volume ids."
command: "{{ bin_dir }}/kubectl exec {{ initial_heketi_pod_name }} -- heketi-cli --user admin --secret {{ heketi_admin_key }} volume list --json"
changed_when: false
register: "heketi_volumes"
- name: "Get heketi volumes."
changed_when: false
command: "{{ bin_dir }}/kubectl exec {{ initial_heketi_pod_name }} -- heketi-cli --user admin --secret {{ heketi_admin_key }} volume info {{ volume_id }} --json"
with_items: "{{ heketi_volumes.stdout | from_json | json_query(\"volumes[*]\") }}"
loop_control: { loop_var: "volume_id" }
register: "volumes_information"
- name: "Test heketi database volume."
set_fact: { heketi_database_volume_exists: true }
with_items: "{{ volumes_information.results }}"
loop_control: { loop_var: "volume_information" }
vars: { volume: "{{ volume_information.stdout | from_json }}" }
when: "volume.name == 'heketidbstorage'"
- name: "Provision database volume."
command: "{{ bin_dir }}/kubectl exec {{ initial_heketi_pod_name }} -- heketi-cli --user admin --secret {{ heketi_admin_key }} setup-openshift-heketi-storage"
when: "heketi_database_volume_exists is undefined"
- name: "Copy configuration from pod."
become: true
command: "{{ bin_dir }}/kubectl cp {{ initial_heketi_pod_name }}:/heketi-storage.json {{ kube_config_dir }}/heketi-storage-bootstrap.json"
- name: "Get heketi volume ids."
command: "{{ bin_dir }}/kubectl exec {{ initial_heketi_pod_name }} -- heketi-cli --user admin --secret {{ heketi_admin_key }} volume list --json"
changed_when: false
register: "heketi_volumes"
- name: "Get heketi volumes."
changed_when: false
command: "{{ bin_dir }}/kubectl exec {{ initial_heketi_pod_name }} -- heketi-cli --user admin --secret {{ heketi_admin_key }} volume info {{ volume_id }} --json"
with_items: "{{ heketi_volumes.stdout | from_json | json_query(\"volumes[*]\") }}"
loop_control: { loop_var: "volume_id" }
register: "volumes_information"
- name: "Test heketi database volume."
set_fact: { heketi_database_volume_created: true }
with_items: "{{ volumes_information.results }}"
loop_control: { loop_var: "volume_information" }
vars: { volume: "{{ volume_information.stdout | from_json }}" }
when: "volume.name == 'heketidbstorage'"
- name: "Ensure heketi database volume exists."
assert: { that: "heketi_database_volume_created is defined", msg: "Heketi database volume does not exist." }

@@ -1,4 +0,0 @@
---
- name: "Clean up left over jobs."
command: "{{ bin_dir }}/kubectl delete jobs,pods --selector=\"deploy-heketi\""
changed_when: false

@@ -1,44 +0,0 @@
---
- name: "Kubernetes Apps | Lay Down GlusterFS Daemonset"
template:
src: "glusterfs-daemonset.json.j2"
dest: "{{ kube_config_dir }}/glusterfs-daemonset.json"
mode: "0644"
become: true
register: "rendering"
- name: "Kubernetes Apps | Install and configure GlusterFS daemonset"
kube:
name: "GlusterFS"
kubectl: "{{ bin_dir }}/kubectl"
filename: "{{ kube_config_dir }}/glusterfs-daemonset.json"
state: "{{ rendering.changed | ternary('latest', 'present') }}"
- name: "Kubernetes Apps | Label GlusterFS nodes"
include_tasks: "glusterfs/label.yml"
with_items: "{{ groups['heketi-node'] }}"
loop_control:
loop_var: "node"
- name: "Kubernetes Apps | Wait for daemonset to become available."
register: "daemonset_state"
command: "{{ bin_dir }}/kubectl get daemonset glusterfs --output=json --ignore-not-found=true"
changed_when: false
vars:
daemonset_state: { stdout: "{}" }
ready: "{{ daemonset_state.stdout | from_json | json_query(\"status.numberReady\") }}"
desired: "{{ daemonset_state.stdout | from_json | json_query(\"status.desiredNumberScheduled\") }}"
until: "ready | int >= 3"
retries: 60
delay: 5
- name: "Kubernetes Apps | Lay Down Heketi Service Account"
template:
src: "heketi-service-account.json.j2"
dest: "{{ kube_config_dir }}/heketi-service-account.json"
mode: "0644"
become: true
register: "rendering"
- name: "Kubernetes Apps | Install and configure Heketi Service Account"
kube:
name: "GlusterFS"
kubectl: "{{ bin_dir }}/kubectl"
filename: "{{ kube_config_dir }}/heketi-service-account.json"
state: "{{ rendering.changed | ternary('latest', 'present') }}"

@@ -1,19 +0,0 @@
---
- name: Get storage nodes
register: "label_present"
command: "{{ bin_dir }}/kubectl get node --selector=storagenode=glusterfs,kubernetes.io/hostname={{ node }} --ignore-not-found=true"
changed_when: false
- name: "Assign storage label"
when: "label_present.stdout_lines | length == 0"
command: "{{ bin_dir }}/kubectl label node {{ node }} storagenode=glusterfs"
- name: Get storage nodes again
register: "label_present"
command: "{{ bin_dir }}/kubectl get node --selector=storagenode=glusterfs,kubernetes.io/hostname={{ node }} --ignore-not-found=true"
changed_when: false
- name: Ensure the label has been set
assert:
that: "label_present.stdout_lines | length > 0"
msg: "Node {{ node }} has not been assigned with label storagenode=glusterfs."

View File

@@ -1,34 +0,0 @@
---
- name: "Kubernetes Apps | Lay Down Heketi"
become: true
template:
src: "heketi-deployment.json.j2"
dest: "{{ kube_config_dir }}/heketi-deployment.json"
mode: "0644"
register: "rendering"
- name: "Kubernetes Apps | Install and configure Heketi"
kube:
name: "GlusterFS"
kubectl: "{{ bin_dir }}/kubectl"
filename: "{{ kube_config_dir }}/heketi-deployment.json"
state: "{{ rendering.changed | ternary('latest', 'present') }}"
- name: "Ensure heketi is up and running."
changed_when: false
register: "heketi_state"
vars:
heketi_state:
stdout: "{}"
pods_query: "items[?kind=='Pod'].status.conditions|[0][?type=='Ready'].status|[0]"
deployments_query: "items[?kind=='Deployment'].status.conditions|[0][?type=='Available'].status|[0]"
command: "{{ bin_dir }}/kubectl get deployments,pods --selector=glusterfs --output=json"
until:
- "heketi_state.stdout | from_json | json_query(pods_query) == 'True'"
- "heketi_state.stdout | from_json | json_query(deployments_query) == 'True'"
retries: 60
delay: 5
- name: Set the Heketi pod name
set_fact:
heketi_pod_name: "{{ heketi_state.stdout | from_json | json_query(\"items[?kind=='Pod'].metadata.name|[0]\") }}"

View File

@@ -1,30 +0,0 @@
---
- name: "Kubernetes Apps | GlusterFS"
include_tasks: "glusterfs.yml"
- name: "Kubernetes Apps | Heketi Secrets"
include_tasks: "secret.yml"
- name: "Kubernetes Apps | Test Heketi"
register: "heketi_service_state"
command: "{{ bin_dir }}/kubectl get service heketi-storage-endpoints -o=name --ignore-not-found=true"
changed_when: false
- name: "Kubernetes Apps | Bootstrap Heketi"
when: "heketi_service_state.stdout == \"\""
include_tasks: "bootstrap.yml"
- name: "Kubernetes Apps | Heketi"
include_tasks: "heketi.yml"
- name: "Kubernetes Apps | Heketi Topology"
include_tasks: "topology.yml"
- name: "Kubernetes Apps | Heketi Storage"
include_tasks: "storage.yml"
- name: "Kubernetes Apps | Storage Class"
include_tasks: "storageclass.yml"
- name: "Clean up"
include_tasks: "cleanup.yml"

View File

@@ -1,45 +0,0 @@
---
- name: Get clusterrolebindings
register: "clusterrolebinding_state"
command: "{{ bin_dir }}/kubectl get clusterrolebinding heketi-gluster-admin -o=name --ignore-not-found=true"
changed_when: false
- name: "Kubernetes Apps | Deploy cluster role binding."
when: "clusterrolebinding_state.stdout | length == 0"
command: "{{ bin_dir }}/kubectl create clusterrolebinding heketi-gluster-admin --clusterrole=edit --serviceaccount=default:heketi-service-account"
- name: Get clusterrolebindings again
register: "clusterrolebinding_state"
command: "{{ bin_dir }}/kubectl get clusterrolebinding heketi-gluster-admin -o=name --ignore-not-found=true"
changed_when: false
- name: Make sure that clusterrolebindings are present now
assert:
that: "clusterrolebinding_state.stdout | length > 0"
msg: "Cluster role binding is not present."
- name: Get the heketi-config-secret secret
register: "secret_state"
command: "{{ bin_dir }}/kubectl get secret heketi-config-secret -o=name --ignore-not-found=true"
changed_when: false
- name: "Render Heketi secret configuration."
become: true
template:
src: "heketi.json.j2"
dest: "{{ kube_config_dir }}/heketi.json"
mode: "0644"
- name: "Deploy Heketi config secret"
when: "secret_state.stdout | length == 0"
command: "{{ bin_dir }}/kubectl create secret generic heketi-config-secret --from-file={{ kube_config_dir }}/heketi.json"
- name: Get the heketi-config-secret secret again
register: "secret_state"
command: "{{ bin_dir }}/kubectl get secret heketi-config-secret -o=name --ignore-not-found=true"
changed_when: false
- name: Make sure the heketi-config-secret secret exists now
assert:
that: "secret_state.stdout | length > 0"
msg: "Heketi config secret is not present."

View File

@@ -1,15 +0,0 @@
---
- name: "Kubernetes Apps | Lay Down Heketi Storage"
become: true
vars: { nodes: "{{ groups['heketi-node'] }}" }
template:
src: "heketi-storage.json.j2"
dest: "{{ kube_config_dir }}/heketi-storage.json"
mode: "0644"
register: "rendering"
- name: "Kubernetes Apps | Install and configure Heketi Storage"
kube:
name: "GlusterFS"
kubectl: "{{ bin_dir }}/kubectl"
filename: "{{ kube_config_dir }}/heketi-storage.json"
state: "{{ rendering.changed | ternary('latest', 'present') }}"

View File

@@ -1,26 +0,0 @@
---
- name: "Test storage class."
command: "{{ bin_dir }}/kubectl get storageclass gluster --ignore-not-found=true --output=json"
register: "storageclass"
changed_when: false
- name: "Test heketi service."
command: "{{ bin_dir }}/kubectl get service heketi --ignore-not-found=true --output=json"
register: "heketi_service"
changed_when: false
- name: "Ensure heketi service is available."
assert: { that: "heketi_service.stdout != \"\"" }
- name: "Render storage class configuration."
become: true
vars:
endpoint_address: "{{ (heketi_service.stdout | from_json).spec.clusterIP }}"
template:
src: "storageclass.yml.j2"
dest: "{{ kube_config_dir }}/storageclass.yml"
mode: "0644"
register: "rendering"
- name: "Kubernetes Apps | Install and configure Storage Class"
kube:
name: "GlusterFS"
kubectl: "{{ bin_dir }}/kubectl"
filename: "{{ kube_config_dir }}/storageclass.yml"
state: "{{ rendering.changed | ternary('latest', 'present') }}"

View File

@@ -1,26 +0,0 @@
---
- name: "Get heketi topology."
register: "heketi_topology"
changed_when: false
command: "{{ bin_dir }}/kubectl exec {{ heketi_pod_name }} -- heketi-cli --user admin --secret {{ heketi_admin_key }} topology info --json"
- name: "Render heketi topology template."
become: true
vars: { nodes: "{{ groups['heketi-node'] }}" }
register: "rendering"
template:
src: "topology.json.j2"
dest: "{{ kube_config_dir }}/topology.json"
mode: "0644"
- name: "Copy topology configuration into container." # noqa no-handler
when: "rendering.changed"
command: "{{ bin_dir }}/kubectl cp {{ kube_config_dir }}/topology.json {{ heketi_pod_name }}:/tmp/topology.json"
- name: "Load heketi topology." # noqa no-handler
when: "rendering.changed"
command: "{{ bin_dir }}/kubectl exec {{ heketi_pod_name }} -- heketi-cli --user admin --secret {{ heketi_admin_key }} topology load --json=/tmp/topology.json"
- name: "Get heketi topology."
register: "heketi_topology"
changed_when: false
command: "{{ bin_dir }}/kubectl exec {{ heketi_pod_name }} -- heketi-cli --user admin --secret {{ heketi_admin_key }} topology info --json"
until: "heketi_topology.stdout | from_json | json_query(\"clusters[*].nodes[*].devices[?state=='online'].id\") | flatten | length == groups['heketi-node'] | length"
retries: 60
delay: 5

View File

@@ -1,149 +0,0 @@
{
"kind": "DaemonSet",
"apiVersion": "apps/v1",
"metadata": {
"name": "glusterfs",
"labels": {
"glusterfs": "deployment"
},
"annotations": {
"description": "GlusterFS Daemon Set",
"tags": "glusterfs"
}
},
"spec": {
"selector": {
"matchLabels": {
"glusterfs-node": "daemonset"
}
},
"template": {
"metadata": {
"name": "glusterfs",
"labels": {
"glusterfs-node": "daemonset"
}
},
"spec": {
"nodeSelector": {
"storagenode" : "glusterfs"
},
"hostNetwork": true,
"containers": [
{
"image": "gluster/gluster-centos:gluster4u0_centos7",
"imagePullPolicy": "IfNotPresent",
"name": "glusterfs",
"volumeMounts": [
{
"name": "glusterfs-heketi",
"mountPath": "/var/lib/heketi"
},
{
"name": "glusterfs-run",
"mountPath": "/run"
},
{
"name": "glusterfs-lvm",
"mountPath": "/run/lvm"
},
{
"name": "glusterfs-etc",
"mountPath": "/etc/glusterfs"
},
{
"name": "glusterfs-logs",
"mountPath": "/var/log/glusterfs"
},
{
"name": "glusterfs-config",
"mountPath": "/var/lib/glusterd"
},
{
"name": "glusterfs-dev",
"mountPath": "/dev"
},
{
"name": "glusterfs-cgroup",
"mountPath": "/sys/fs/cgroup"
}
],
"securityContext": {
"capabilities": {},
"privileged": true
},
"readinessProbe": {
"timeoutSeconds": {{ glusterfs_daemonset.readiness_probe.timeout_seconds }},
"initialDelaySeconds": {{ glusterfs_daemonset.readiness_probe.initial_delay_seconds }},
"exec": {
"command": [
"/bin/bash",
"-c",
"systemctl status glusterd.service"
]
}
},
"livenessProbe": {
"timeoutSeconds": {{ glusterfs_daemonset.liveness_probe.timeout_seconds }},
"initialDelaySeconds": {{ glusterfs_daemonset.liveness_probe.initial_delay_seconds }},
"exec": {
"command": [
"/bin/bash",
"-c",
"systemctl status glusterd.service"
]
}
}
}
],
"volumes": [
{
"name": "glusterfs-heketi",
"hostPath": {
"path": "/var/lib/heketi"
}
},
{
"name": "glusterfs-run"
},
{
"name": "glusterfs-lvm",
"hostPath": {
"path": "/run/lvm"
}
},
{
"name": "glusterfs-etc",
"hostPath": {
"path": "/etc/glusterfs"
}
},
{
"name": "glusterfs-logs",
"hostPath": {
"path": "/var/log/glusterfs"
}
},
{
"name": "glusterfs-config",
"hostPath": {
"path": "/var/lib/glusterd"
}
},
{
"name": "glusterfs-dev",
"hostPath": {
"path": "/dev"
}
},
{
"name": "glusterfs-cgroup",
"hostPath": {
"path": "/sys/fs/cgroup"
}
}
]
}
}
}
}

View File

@@ -1,138 +0,0 @@
{
"kind": "List",
"apiVersion": "v1",
"items": [
{
"kind": "Service",
"apiVersion": "v1",
"metadata": {
"name": "deploy-heketi",
"labels": {
"glusterfs": "heketi-service",
"deploy-heketi": "support"
},
"annotations": {
"description": "Exposes Heketi Service"
}
},
"spec": {
"selector": {
"name": "deploy-heketi"
},
"ports": [
{
"name": "deploy-heketi",
"port": 8080,
"targetPort": 8080
}
]
}
},
{
"kind": "Deployment",
"apiVersion": "apps/v1",
"metadata": {
"name": "deploy-heketi",
"labels": {
"glusterfs": "heketi-deployment",
"deploy-heketi": "deployment"
},
"annotations": {
"description": "Defines how to deploy Heketi"
}
},
"spec": {
"selector": {
"matchLabels": {
"name": "deploy-heketi"
}
},
"replicas": 1,
"template": {
"metadata": {
"name": "deploy-heketi",
"labels": {
"name": "deploy-heketi",
"glusterfs": "heketi-pod",
"deploy-heketi": "pod"
}
},
"spec": {
"serviceAccountName": "heketi-service-account",
"containers": [
{
"image": "heketi/heketi:9",
"imagePullPolicy": "Always",
"name": "deploy-heketi",
"env": [
{
"name": "HEKETI_EXECUTOR",
"value": "kubernetes"
},
{
"name": "HEKETI_DB_PATH",
"value": "/var/lib/heketi/heketi.db"
},
{
"name": "HEKETI_FSTAB",
"value": "/var/lib/heketi/fstab"
},
{
"name": "HEKETI_SNAPSHOT_LIMIT",
"value": "14"
},
{
"name": "HEKETI_KUBE_GLUSTER_DAEMONSET",
"value": "y"
}
],
"ports": [
{
"containerPort": 8080
}
],
"volumeMounts": [
{
"name": "db",
"mountPath": "/var/lib/heketi"
},
{
"name": "config",
"mountPath": "/etc/heketi"
}
],
"readinessProbe": {
"timeoutSeconds": 3,
"initialDelaySeconds": 3,
"httpGet": {
"path": "/hello",
"port": 8080
}
},
"livenessProbe": {
"timeoutSeconds": 3,
"initialDelaySeconds": 10,
"httpGet": {
"path": "/hello",
"port": 8080
}
}
}
],
"volumes": [
{
"name": "db"
},
{
"name": "config",
"secret": {
"secretName": "heketi-config-secret"
}
}
]
}
}
}
}
]
}

View File

@@ -1,164 +0,0 @@
{
"kind": "List",
"apiVersion": "v1",
"items": [
{
"kind": "Secret",
"apiVersion": "v1",
"metadata": {
"name": "heketi-db-backup",
"labels": {
"glusterfs": "heketi-db",
"heketi": "db"
}
},
"data": {
},
"type": "Opaque"
},
{
"kind": "Service",
"apiVersion": "v1",
"metadata": {
"name": "heketi",
"labels": {
"glusterfs": "heketi-service",
"deploy-heketi": "support"
},
"annotations": {
"description": "Exposes Heketi Service"
}
},
"spec": {
"selector": {
"name": "heketi"
},
"ports": [
{
"name": "heketi",
"port": 8080,
"targetPort": 8080
}
]
}
},
{
"kind": "Deployment",
"apiVersion": "apps/v1",
"metadata": {
"name": "heketi",
"labels": {
"glusterfs": "heketi-deployment"
},
"annotations": {
"description": "Defines how to deploy Heketi"
}
},
"spec": {
"selector": {
"matchLabels": {
"name": "heketi"
}
},
"replicas": 1,
"template": {
"metadata": {
"name": "heketi",
"labels": {
"name": "heketi",
"glusterfs": "heketi-pod"
}
},
"spec": {
"serviceAccountName": "heketi-service-account",
"containers": [
{
"image": "heketi/heketi:9",
"imagePullPolicy": "Always",
"name": "heketi",
"env": [
{
"name": "HEKETI_EXECUTOR",
"value": "kubernetes"
},
{
"name": "HEKETI_DB_PATH",
"value": "/var/lib/heketi/heketi.db"
},
{
"name": "HEKETI_FSTAB",
"value": "/var/lib/heketi/fstab"
},
{
"name": "HEKETI_SNAPSHOT_LIMIT",
"value": "14"
},
{
"name": "HEKETI_KUBE_GLUSTER_DAEMONSET",
"value": "y"
}
],
"ports": [
{
"containerPort": 8080
}
],
"volumeMounts": [
{
"mountPath": "/backupdb",
"name": "heketi-db-secret"
},
{
"name": "db",
"mountPath": "/var/lib/heketi"
},
{
"name": "config",
"mountPath": "/etc/heketi"
}
],
"readinessProbe": {
"timeoutSeconds": 3,
"initialDelaySeconds": 3,
"httpGet": {
"path": "/hello",
"port": 8080
}
},
"livenessProbe": {
"timeoutSeconds": 3,
"initialDelaySeconds": 10,
"httpGet": {
"path": "/hello",
"port": 8080
}
}
}
],
"volumes": [
{
"name": "db",
"glusterfs": {
"endpoints": "heketi-storage-endpoints",
"path": "heketidbstorage"
}
},
{
"name": "heketi-db-secret",
"secret": {
"secretName": "heketi-db-backup"
}
},
{
"name": "config",
"secret": {
"secretName": "heketi-config-secret"
}
}
]
}
}
}
}
]
}

View File

@@ -1,7 +0,0 @@
{
"apiVersion": "v1",
"kind": "ServiceAccount",
"metadata": {
"name": "heketi-service-account"
}
}

View File

@@ -1,54 +0,0 @@
{
"apiVersion": "v1",
"kind": "List",
"items": [
{
"kind": "Endpoints",
"apiVersion": "v1",
"metadata": {
"name": "heketi-storage-endpoints",
"creationTimestamp": null
},
"subsets": [
{% set nodeblocks = [] %}
{% for node in nodes %}
{% set nodeblock %}
{
"addresses": [
{
"ip": "{{ hostvars[node].ip }}"
}
],
"ports": [
{
"port": 1
}
]
}
{% endset %}
{% if nodeblocks.append(nodeblock) %}{% endif %}
{% endfor %}
{{ nodeblocks|join(',') }}
]
},
{
"kind": "Service",
"apiVersion": "v1",
"metadata": {
"name": "heketi-storage-endpoints",
"creationTimestamp": null
},
"spec": {
"ports": [
{
"port": 1,
"targetPort": 0
}
]
},
"status": {
"loadBalancer": {}
}
}
]
}

View File

@@ -1,44 +0,0 @@
{
"_port_comment": "Heketi Server Port Number",
"port": "8080",
"_use_auth": "Enable JWT authorization. Please enable for deployment",
"use_auth": true,
"_jwt": "Private keys for access",
"jwt": {
"_admin": "Admin has access to all APIs",
"admin": {
"key": "{{ heketi_admin_key }}"
},
"_user": "User only has access to /volumes endpoint",
"user": {
"key": "{{ heketi_user_key }}"
}
},
"_glusterfs_comment": "GlusterFS Configuration",
"glusterfs": {
"_executor_comment": "Execute plugin. Possible choices: mock, kubernetes, ssh",
"executor": "kubernetes",
"_db_comment": "Database file name",
"db": "/var/lib/heketi/heketi.db",
"kubeexec": {
"rebalance_on_expansion": true
},
"sshexec": {
"rebalance_on_expansion": true,
"keyfile": "/etc/heketi/private_key",
"fstab": "/etc/fstab",
"port": "22",
"user": "root",
"sudo": false
}
},
"_backup_db_to_kube_secret": "Backup the heketi database to a Kubernetes secret when running in Kubernetes. Default is off.",
"backup_db_to_kube_secret": false
}

View File

@@ -1,12 +0,0 @@
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: gluster
annotations:
storageclass.beta.kubernetes.io/is-default-class: "true"
provisioner: kubernetes.io/glusterfs
parameters:
resturl: "http://{{ endpoint_address }}:8080"
restuser: "admin"
restuserkey: "{{ heketi_admin_key }}"

View File

@@ -1,34 +0,0 @@
{
"clusters": [
{
"nodes": [
{% set nodeblocks = [] %}
{% for node in nodes %}
{% set nodeblock %}
{
"node": {
"hostnames": {
"manage": [
"{{ node }}"
],
"storage": [
"{{ hostvars[node].ip }}"
]
},
"zone": 1
},
"devices": [
{
"name": "{{ hostvars[node]['disk_volume_device_1'] }}",
"destroydata": false
}
]
}
{% endset %}
{% if nodeblocks.append(nodeblock) %}{% endif %}
{% endfor %}
{{ nodeblocks|join(',') }}
]
}
]
}

View File

@@ -1,2 +0,0 @@
---
heketi_remove_lvm: false

View File

@@ -1,52 +0,0 @@
---
- name: "Install lvm utils (RedHat)"
become: true
package:
name: "lvm2"
state: "present"
when: "ansible_os_family == 'RedHat'"
- name: "Install lvm utils (Debian)"
become: true
apt:
name: "lvm2"
state: "present"
when: "ansible_os_family == 'Debian'"
- name: "Get volume group information."
environment:
PATH: "{{ ansible_env.PATH }}:/sbin" # Make sure we can workaround RH / CentOS conservative path management
become: true
shell: "pvs {{ disk_volume_device_1 }} --option vg_name | tail -n+2"
register: "volume_groups"
ignore_errors: true # noqa ignore-errors
changed_when: false
- name: "Remove volume groups."
environment:
PATH: "{{ ansible_env.PATH }}:/sbin" # Make sure we can workaround RH / CentOS conservative path management
become: true
command: "vgremove {{ volume_group }} --yes"
with_items: "{{ volume_groups.stdout_lines }}"
loop_control: { loop_var: "volume_group" }
- name: "Remove physical volume from cluster disks."
environment:
PATH: "{{ ansible_env.PATH }}:/sbin" # Make sure we can workaround RH / CentOS conservative path management
become: true
command: "pvremove {{ disk_volume_device_1 }} --yes"
ignore_errors: true # noqa ignore-errors
- name: "Remove lvm utils (RedHat)"
become: true
package:
name: "lvm2"
state: "absent"
when: "ansible_os_family == 'RedHat' and heketi_remove_lvm"
- name: "Remove lvm utils (Debian)"
become: true
apt:
name: "lvm2"
state: "absent"
when: "ansible_os_family == 'Debian' and heketi_remove_lvm"

View File

@@ -1,51 +0,0 @@
---
- name: Remove storage class.
command: "{{ bin_dir }}/kubectl delete storageclass gluster"
ignore_errors: true # noqa ignore-errors
- name: Tear down heketi.
command: "{{ bin_dir }}/kubectl delete all,service,jobs,deployment,secret --selector=\"glusterfs=heketi-pod\""
ignore_errors: true # noqa ignore-errors
- name: Tear down heketi.
command: "{{ bin_dir }}/kubectl delete all,service,jobs,deployment,secret --selector=\"glusterfs=heketi-deployment\""
ignore_errors: true # noqa ignore-errors
- name: Tear down bootstrap.
include_tasks: "../../provision/tasks/bootstrap/tear-down.yml"
- name: Ensure there is nothing left over.
command: "{{ bin_dir }}/kubectl get all,service,jobs,deployment,secret --selector=\"glusterfs=heketi-pod\" -o=json"
register: "heketi_result"
until: "heketi_result.stdout | from_json | json_query('items[*]') | length == 0"
retries: 60
delay: 5
- name: Ensure there is nothing left over.
command: "{{ bin_dir }}/kubectl get all,service,jobs,deployment,secret --selector=\"glusterfs=heketi-deployment\" -o=json"
register: "heketi_result"
until: "heketi_result.stdout | from_json | json_query('items[*]') | length == 0"
retries: 60
delay: 5
- name: Tear down glusterfs.
command: "{{ bin_dir }}/kubectl delete daemonset.extensions/glusterfs"
ignore_errors: true # noqa ignore-errors
- name: Remove heketi storage service.
command: "{{ bin_dir }}/kubectl delete service heketi-storage-endpoints"
ignore_errors: true # noqa ignore-errors
- name: Remove heketi gluster role binding
command: "{{ bin_dir }}/kubectl delete clusterrolebinding heketi-gluster-admin"
ignore_errors: true # noqa ignore-errors
- name: Remove heketi config secret
command: "{{ bin_dir }}/kubectl delete secret heketi-config-secret"
ignore_errors: true # noqa ignore-errors
- name: Remove heketi db backup
command: "{{ bin_dir }}/kubectl delete secret heketi-db-backup"
ignore_errors: true # noqa ignore-errors
- name: Remove heketi service account
command: "{{ bin_dir }}/kubectl delete serviceaccount heketi-service-account"
ignore_errors: true # noqa ignore-errors
- name: Get secrets
command: "{{ bin_dir }}/kubectl get secrets --output=\"json\""
register: "secrets"
changed_when: false
- name: Remove heketi storage secret
vars: { storage_query: "items[?metadata.annotations.\"kubernetes.io/service-account.name\"=='heketi-service-account'].metadata.name|[0]" }
command: "{{ bin_dir }}/kubectl delete secret {{ secrets.stdout | from_json | json_query(storage_query) }}"
when: "storage_query is defined"
ignore_errors: true # noqa ignore-errors

View File

@@ -31,7 +31,7 @@ manage-offline-container-images.sh register
## generate_list.sh
This script generates the list of downloaded files and the list of container images from the `roles/kubespray-defaults/defaults/main/download.yml` file.
This script generates the list of downloaded files and the list of container images from the `roles/kubespray_defaults/defaults/main/download.yml` file.
Running this script executes the `generate_list.yml` playbook in the kubespray root directory and generates four files:
all downloaded file URLs in files.list, all container images in images.list, and Jinja2 templates in *.template.
@@ -67,3 +67,23 @@ Step(2) download files and run nginx container
```
When the nginx container is running, it can be accessed through <http://127.0.0.1:8080/>.
## upload2artifactory.py
After the steps above, this script can recursively upload each file under a directory to a generic repository in Artifactory.
Environment Variables:
- USERNAME -- A user with at least the 'Deploy/Cache' and 'Delete/Overwrite' permissions.
- TOKEN -- Generate this with 'Set Me Up' in your user.
- BASE_URL -- The URL including the repository name.
Step(3) (optional) upload files to Artifactory
```shell
cd kubespray/contrib/offline/offline-files
export USERNAME=admin
export TOKEN=...
export BASE_URL=https://artifactory.example.com/artifactory/a-generic-repo/
./upload2artifactory.py
```
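The upload step joins `BASE_URL` with each file's path relative to the working directory. The following is a minimal Python sketch of that join, not the script's own code: `build_destination_url` is a hypothetical helper, and stripping a trailing slash from `BASE_URL` (as in the example above) and percent-encoding the path are assumptions made here for illustration.

```python
from pathlib import PurePath
from urllib.parse import quote


def build_destination_url(base_url: str, relative_path: str) -> str:
    """Join a repository URL and a relative file path into one upload URL.

    Hypothetical helper for illustration only; it is not part of
    upload2artifactory.py.
    """
    # Drop trailing slashes so the result has no "//" after the repository name.
    base = base_url.rstrip("/")
    # Use forward slashes regardless of the local OS path separator.
    posix_path = PurePath(relative_path).as_posix()
    # Percent-encode characters that are not URL-safe, but keep "/" as-is.
    return f"{base}/{quote(posix_path)}"
```

With the example `BASE_URL` above and a made-up relative path such as `some-dir/some-file.tar.gz`, the helper yields a single-slash URL ending in `a-generic-repo/some-dir/some-file.tar.gz`.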

View File

@@ -5,7 +5,7 @@ CURRENT_DIR=$(cd $(dirname $0); pwd)
TEMP_DIR="${CURRENT_DIR}/temp"
REPO_ROOT_DIR="${CURRENT_DIR%/contrib/offline}"
: ${DOWNLOAD_YML:="roles/kubespray-defaults/defaults/main/download.yml"}
: ${DOWNLOAD_YML:="roles/kubespray_defaults/defaults/main/download.yml"}
mkdir -p ${TEMP_DIR}
@@ -19,12 +19,12 @@ sed -n '/^downloads:/,/download_defaults:/p' ${REPO_ROOT_DIR}/${DOWNLOAD_YML} \
| sed 'N;s#\n# #g' | tr ' ' ':' | sed 's/\"//g' > ${TEMP_DIR}/images.list.template
# add kube-* images to images list template
# Those container images are downloaded by kubeadm, then roles/kubespray-defaults/defaults/main/download.yml
# Those container images are downloaded by kubeadm, then roles/kubespray_defaults/defaults/main/download.yml
# doesn't contain those images. That is why these images need to be added to the
# list separately.
KUBE_IMAGES="kube-apiserver kube-controller-manager kube-scheduler kube-proxy"
for i in $KUBE_IMAGES; do
echo "{{ kube_image_repo }}/$i:{{ kube_version }}" >> ${TEMP_DIR}/images.list.template
echo "{{ kube_image_repo }}/$i:v{{ kube_version }}" >> ${TEMP_DIR}/images.list.template
done
# run ansible to expand templates

View File

@@ -5,7 +5,7 @@
roles:
# Just load default variables from roles.
- role: kubespray-defaults
- role: kubespray_defaults
when: false
- role: download
when: false

View File

@@ -1,4 +1,4 @@
#!/bin/bash
#!/usr/bin/env bash
OPTION=$1
CURRENT_DIR=$(cd $(dirname $0); pwd)
@@ -36,7 +36,7 @@ function create_container_image_tar() {
mkdir ${IMAGE_DIR}
cd ${IMAGE_DIR}
sudo ${runtime} pull registry:latest
sudo --preserve-env=http_proxy,https_proxy,no_proxy ${runtime} pull registry:latest
sudo ${runtime} save -o registry-latest.tar registry:latest
while read -r image
@@ -45,7 +45,7 @@ function create_container_image_tar() {
set +e
for step in $(seq 1 ${RETRY_COUNT})
do
sudo ${runtime} pull ${image}
sudo --preserve-env=http_proxy,https_proxy,no_proxy ${runtime} pull ${image}
if [ $? -eq 0 ]; then
break
fi
@@ -118,6 +118,8 @@ function register_container_images() {
cp ${CURRENT_DIR}/registries.conf ${TEMP_DIR}/registries.conf
sed -i s@"HOSTNAME"@"$(hostname)"@ ${TEMP_DIR}/registries.conf
sudo cp ${TEMP_DIR}/registries.conf /etc/containers/registries.conf
elif [ "$(uname)" == "Darwin" ]; then
echo "This is a Mac, no configuration changes are required"
else
echo "runtime package (docker-ce, podman, nerdctl, etc.) should be installed"
exit 1
@@ -125,7 +127,7 @@ function register_container_images() {
tar -zxvf ${IMAGE_TAR_FILE}
if [ "${create_registry}" ]; then
if ${create_registry}; then
sudo ${runtime} load -i ${IMAGE_DIR}/registry-latest.tar
set +e
@@ -146,7 +148,7 @@ function register_container_images() {
if [ "${org_image}" == "ID:" ]; then
org_image=$(echo "${load_image}" | awk '{print $4}')
fi
image_id=$(sudo ${runtime} image inspect ${org_image} | grep "\"Id\":" | awk -F: '{print $3}'| sed s/'\",'//)
image_id=$(sudo ${runtime} image inspect --format "{{.Id}}" "${org_image}")
if [ -z "${file_name}" ]; then
echo "Failed to get file_name for line ${line}"
exit 1

View File

@@ -41,7 +41,7 @@ fi
sudo "${runtime}" container inspect nginx >/dev/null 2>&1
if [ $? -ne 0 ]; then
sudo "${runtime}" run \
sudo --preserve-env=http_proxy,https_proxy,no_proxy "${runtime}" run \
--restart=always -d -p ${NGINX_PORT}:80 \
--volume "${OFFLINE_FILES_DIR}":/usr/share/nginx/html/download \
--volume "${CURRENT_DIR}"/nginx.conf:/etc/nginx/nginx.conf \

View File

@@ -0,0 +1,65 @@
#!/usr/bin/env python3
"""This is a helper script to manage-offline-files.sh.

After running manage-offline-files.sh, you can run upload2artifactory.py
to recursively upload each file to a generic repository in Artifactory.

This script recurses the current working directory and is intended to
be started from 'kubespray/contrib/offline/offline-files'

Environment Variables:
    USERNAME -- At least permissions 'Deploy/Cache' and 'Delete/Overwrite'.
    TOKEN -- Generate this with 'Set Me Up' in your user.
    BASE_URL -- The URL including the repository name.

"""
import os
import urllib.request
import base64


def upload_file(file_path, destination_url, username, token):
    """Helper function to upload a single file"""
    try:
        with open(file_path, 'rb') as f:
            file_data = f.read()

        request = urllib.request.Request(destination_url, data=file_data, method='PUT')  # NOQA
        auth_header = base64.b64encode(f"{username}:{token}".encode()).decode()
        request.add_header("Authorization", f"Basic {auth_header}")

        with urllib.request.urlopen(request) as response:
            if response.status in [200, 201]:
                print(f"Success: Uploaded {file_path}")
            else:
                print(f"Failed: {response.status} {response.read().decode('utf-8')}")  # NOQA
    except urllib.error.HTTPError as e:
        print(f"HTTPError: {e.code} {e.reason} for {file_path}")
    except urllib.error.URLError as e:
        print(f"URLError: {e.reason} for {file_path}")
    except OSError as e:
        print(f"OSError: {e.strerror} for {file_path}")


def upload_files(base_url, username, token):
    """ Recurse current dir and upload each file using urllib.request """
    for root, _, files in os.walk(os.getcwd()):
        for file in files:
            file_path = os.path.join(root, file)
            relative_path = os.path.relpath(file_path, os.getcwd())
            destination_url = f"{base_url}/{relative_path}"

            print(f"Uploading {file_path} to {destination_url}")
            upload_file(file_path, destination_url, username, token)


if __name__ == "__main__":
    a_user = os.getenv("USERNAME")
    a_token = os.getenv("TOKEN")
    a_url = os.getenv("BASE_URL")
    if not a_user or not a_token or not a_url:
        print(
            "Error: Environment variables USERNAME, TOKEN, and BASE_URL must be set."  # NOQA
        )
        # Exit with a non-zero status so callers can detect the failure.
        raise SystemExit(1)
    upload_files(a_url, a_user, a_token)

View File

@@ -1,3 +0,0 @@
# See the OWNERS docs at https://go.k8s.io/owners
approvers:
- miouge1

View File

@@ -1,5 +1,11 @@
terraform {
required_version = ">= 0.12.0"
required_providers {
aws = {
source = "hashicorp/aws"
version = "~> 5.0"
}
}
}
provider "aws" {

View File

@@ -1,246 +0,0 @@
# Kubernetes on Equinix Metal with Terraform
Provision a Kubernetes cluster with [Terraform](https://www.terraform.io) on
[Equinix Metal](https://metal.equinix.com) ([formerly Packet](https://blog.equinix.com/blog/2020/10/06/equinix-metal-metal-and-more/)).
## Status
This will install a Kubernetes cluster on Equinix Metal. It should work in all locations and on most server types.
## Approach
The terraform configuration inspects variables found in
[variables.tf](variables.tf) to create resources in your Equinix Metal project.
There is a [python script](../terraform.py) that reads the generated `.tfstate`
file to generate a dynamic inventory that is consumed by [cluster.yml](../../../cluster.yml)
to actually install Kubernetes with Kubespray.
### Kubernetes Nodes
You can create many different kubernetes topologies by setting the number of
different classes of hosts.
- Master nodes with etcd: `number_of_k8s_masters` variable
- Master nodes without etcd: `number_of_k8s_masters_no_etcd` variable
- Standalone etcd hosts: `number_of_etcd` variable
- Kubernetes worker nodes: `number_of_k8s_nodes` variable
Note that the Ansible script will report an invalid configuration if you wind up
with an *even number* of etcd instances. This restriction counts standalone etcd
nodes together with master nodes that carry etcd replicas. For example, if you have
three master nodes with etcd replicas and three standalone etcd nodes, the script
will fail since there are now six total etcd replicas.
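The counting rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the check Kubespray itself runs: the function names are hypothetical, `masters_with_etcd` stands in for `number_of_k8s_masters`, and `standalone_etcd` stands in for `number_of_etcd`.

```python
def etcd_member_count(masters_with_etcd: int, standalone_etcd: int) -> int:
    """Total etcd replicas: masters that run etcd plus standalone etcd hosts.

    Masters without etcd and worker nodes do not contribute to the count.
    """
    return masters_with_etcd + standalone_etcd


def is_odd_etcd_count(masters_with_etcd: int, standalone_etcd: int) -> bool:
    """Return True when the total number of etcd replicas is odd."""
    return etcd_member_count(masters_with_etcd, standalone_etcd) % 2 == 1
```

For the case described above, `is_odd_etcd_count(3, 3)` is false because the total is six, while three masters with etcd and no standalone hosts pass the check.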
## Requirements
- [Install Terraform](https://www.terraform.io/intro/getting-started/install.html)
- [Install Ansible dependencies](/docs/ansible/ansible.md#installing-ansible)
- Account with Equinix Metal
- An SSH key pair
## SSH Key Setup
An SSH keypair is required so Ansible can access the newly provisioned nodes (Equinix Metal hosts). By default, the public SSH key defined in cluster.tfvars (~/.ssh/id_rsa.pub) will be installed in authorized_keys on the newly provisioned nodes. Terraform uploads this public key, and it is then distributed to all the nodes. If you have already set this public key in Equinix Metal (e.g. via the portal), set the public keyfile name in cluster.tfvars to blank to prevent the duplicate key from being uploaded, which would cause an error.
If you don't already have a keypair generated (~/.ssh/id_rsa and ~/.ssh/id_rsa.pub), then a new keypair can be generated with the command:
```ShellSession
ssh-keygen -f ~/.ssh/id_rsa
```
## Terraform
Terraform will be used to provision all of the Equinix Metal resources with base software as appropriate.
### Configuration
#### Inventory files
Create an inventory directory for your cluster by copying the existing sample and linking the `hosts` script (used to build the inventory based on Terraform state):
```ShellSession
cp -LRp contrib/terraform/equinix/sample-inventory inventory/$CLUSTER
cd inventory/$CLUSTER
ln -s ../../contrib/terraform/equinix/hosts
```
This will be the base for subsequent Terraform commands.
#### Equinix Metal API access
Your Equinix Metal API key must be available in the `METAL_AUTH_TOKEN` environment variable.
This key is typically stored outside of the code repo since it is considered secret.
If someone gets this key, they can start up or shut down hosts in your project!
For more information on how to generate an API key or find your project ID, please see
[Accounts Index](https://metal.equinix.com/developers/docs/accounts/).
The Equinix Metal Project ID associated with the key will be set later in `cluster.tfvars`.
For more information about the API, please see [Equinix Metal API](https://metal.equinix.com/developers/api/).
For more information about terraform provider authentication, please see [the equinix provider documentation](https://registry.terraform.io/providers/equinix/equinix/latest/docs).
Example:
```ShellSession
export METAL_AUTH_TOKEN="Example-API-Token"
```
Note that to deploy several clusters within the same project you need to use [terraform workspace](https://www.terraform.io/docs/state/workspaces.html#using-workspaces).
#### Cluster variables
The construction of the cluster is driven by values found in
[variables.tf](variables.tf).
For your cluster, edit `inventory/$CLUSTER/cluster.tfvars`.
The `cluster_name` is used to set a tag on each server deployed as part of this cluster.
This helps when identifying which hosts are associated with each cluster.
While the defaults in `variables.tf` will successfully deploy a cluster, it is recommended to set the following values:
- `cluster_name`: the name of the inventory directory created above as `$CLUSTER`
- `equinix_metal_project_id`: the Equinix Metal Project ID associated with the Equinix Metal API token above
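A quick check that both recommended values are present in the file can be sketched as follows. The function name `check_tfvars` is hypothetical, and the check only looks for uncommented assignments; it does not validate the values themselves.

```ShellSession
# Hypothetical helper: report which recommended variables are missing from a tfvars file.
check_tfvars() {
  missing=0
  for name in cluster_name equinix_metal_project_id; do
    # Match only uncommented assignments such as: cluster_name = "example"
    if ! grep -Eq "^[[:space:]]*${name}[[:space:]]*=" "$1"; then
      echo "$1: ${name} is not set" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

It could be run as `check_tfvars inventory/$CLUSTER/cluster.tfvars` before provisioning.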
#### Enable localhost access
Kubespray can copy a Kubernetes configuration file for this cluster to the local host when
`kubeconfig_localhost: true` is set in the Kubespray configuration.
Edit `inventory/$CLUSTER/group_vars/k8s_cluster/k8s_cluster.yml`, uncomment the following line, and change `false` to `true`:
`# kubeconfig_localhost: false`
becomes:
`kubeconfig_localhost: true`
Once the Kubespray playbooks are run, a Kubernetes configuration file will be written to the local host at `inventory/$CLUSTER/artifacts/admin.conf`.
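The same edit can be scripted. This is a sketch that assumes the commented line has the form `# kubeconfig_localhost: false` described in this section; the function name `enable_kubeconfig_localhost` is hypothetical.

```ShellSession
# Hypothetical helper: uncomment kubeconfig_localhost and set it to true.
# A backup of the original file is kept next to it with a .bak suffix.
enable_kubeconfig_localhost() {
  sed -i.bak -E 's/^#[[:space:]]*kubeconfig_localhost:[[:space:]]*false/kubeconfig_localhost: true/' "$1"
}
```

For example: `enable_kubeconfig_localhost inventory/$CLUSTER/group_vars/k8s_cluster/k8s_cluster.yml`. If the line is formatted differently, the file is left unchanged.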
#### Terraform state files
In the cluster's inventory folder, the following files might be created (either by Terraform
or manually). To prevent you from pushing them accidentally, they are listed in a
`.gitignore` file in the `contrib/terraform/equinix` directory:
- `.terraform`
- `.tfvars`
- `.tfstate`
- `.tfstate.backup`
- `.lock.hcl`
You can still add them manually if you want to.
### Initialization
Before Terraform can operate on your cluster you need to install the required
plugins. This is accomplished as follows:
```ShellSession
cd inventory/$CLUSTER
terraform -chdir=../../contrib/terraform/equinix init -var-file=cluster.tfvars
```
This should finish fairly quickly, reporting that Terraform has successfully initialized and loaded the necessary modules.
### Provisioning cluster
You can apply the Terraform configuration to your cluster with the following commands,
issued from your cluster's inventory directory (`inventory/$CLUSTER`):
```ShellSession
terraform -chdir=../../contrib/terraform/equinix apply -var-file=cluster.tfvars
export ANSIBLE_HOST_KEY_CHECKING=False
ansible-playbook -i hosts ../../cluster.yml
```
### Destroying cluster
You can destroy your new cluster with the following command issued from the cluster's inventory directory:
```ShellSession
terraform -chdir=../../contrib/terraform/equinix destroy -var-file=cluster.tfvars
```
If you've started the Ansible run, it may also be a good idea to do some manual cleanup:
- Remove SSH keys from the destroyed cluster from your `~/.ssh/known_hosts` file
- Clean up any temporary cache files: `rm /tmp/$CLUSTER-*`
### Debugging
You can enable debugging output from Terraform by setting `TF_LOG` to `DEBUG` before running the Terraform command.
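As a small sketch, the variable can be set for a single invocation only, so later Terraform runs keep their normal log level. The wrapper name `tf_debug` is hypothetical.

```ShellSession
# Hypothetical wrapper: run one Terraform command with debug logging enabled.
# TF_LOG is set only for this invocation and is not exported to the shell.
tf_debug() {
  TF_LOG=DEBUG terraform -chdir=../../contrib/terraform/equinix "$@"
}
```

Terraform writes its log output to stderr, so something like `tf_debug apply -var-file=cluster.tfvars 2> terraform-debug.log` would capture it in a file (the file name here is only an example).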
## Ansible
### Node access
#### SSH
Ensure your local ssh-agent is running and your SSH key has been added. This
step is required by the Terraform provisioner:
```ShellSession
eval $(ssh-agent -s)
ssh-add ~/.ssh/id_rsa
```
If you have deployed and destroyed a previous iteration of your cluster, you will need to clear out any stale keys from your SSH "known hosts" file (`~/.ssh/known_hosts`).
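Stale entries can be removed with `ssh-keygen -R`, which deletes all keys belonging to a given host from the known hosts file. The loop below is a sketch; the function name `forget_hosts` is hypothetical, and the host names or IP addresses passed to it are whatever your previous cluster used.

```ShellSession
# Hypothetical helper: drop known_hosts entries for each given host name or IP address.
forget_hosts() {
  for host in "$@"; do
    ssh-keygen -R "$host"
  done
}
```

For example, `forget_hosts 203.0.113.10 203.0.113.11` would remove the entries for those two placeholder addresses.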
#### Test access
Make sure you can connect to the hosts. Note that Flatcar Container Linux by Kinvolk will have a state `FAILED` due to Python not being present. This is okay, because Python will be installed during bootstrapping, so long as the hosts are not `UNREACHABLE`.
```ShellSession
$ ansible -i inventory/$CLUSTER/hosts -m ping all
example-k8s_node-1 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}
example-etcd-1 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}
example-k8s-master-1 | SUCCESS => {
    "changed": false,
    "ping": "pong"
}
```
If it fails, try to connect manually via SSH. It could be something as simple as a stale host key.
### Deploy Kubernetes
```ShellSession
ansible-playbook --become -i inventory/$CLUSTER/hosts cluster.yml
```
This will take some time as there are many tasks to run.
## Kubernetes
### Set up kubectl
- [Install kubectl](https://kubernetes.io/docs/tasks/tools/install-kubectl/) on the localhost.
- Verify that kubectl runs correctly.
```ShellSession
kubectl version
```
- Verify that the Kubernetes configuration file has been copied over.
```ShellSession
cat inventory/$CLUSTER/artifacts/admin.conf
```
- Verify that all the nodes are running correctly.
```ShellSession
kubectl --kubeconfig=inventory/$CLUSTER/artifacts/admin.conf get nodes
```
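To avoid repeating `--kubeconfig` on every command, the standard `KUBECONFIG` environment variable can point at the same file. The helper below is a sketch meant to be run from the repository root; the function name `use_cluster_kubeconfig` is hypothetical.

```ShellSession
# Hypothetical helper: point kubectl at the admin.conf written by Kubespray.
use_cluster_kubeconfig() {
  conf="inventory/$1/artifacts/admin.conf"
  if [ ! -f "$conf" ]; then
    echo "$conf not found; was kubeconfig_localhost enabled?" >&2
    return 1
  fi
  # Use an absolute path so kubectl keeps working after changing directory.
  export KUBECONFIG="$PWD/$conf"
}
```

After `use_cluster_kubeconfig "$CLUSTER"`, a plain `kubectl get nodes` would use the cluster's configuration file.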
## What's next
Try out your new Kubernetes cluster with the [Hello Kubernetes service](https://kubernetes.io/docs/tasks/access-application-cluster/service-access-application-cluster/).


@@ -1 +0,0 @@
../terraform.py
