Compare commits

...

155 Commits

Author SHA1 Message Date
k8s-infra-cherrypick-robot
a20891ab67 Fix SAN address collection from ansible_default_ipv{4,6} (#12505)
Signed-off-by: Hyeonki Hong <hhk7734@gmail.com>
Co-authored-by: Hyeonki Hong <hhk7734@gmail.com>
2025-08-26 03:02:11 -07:00
Max Gautier
0858e46dc6 Patch versions updates (#12504)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-08-26 02:36:11 -07:00
ChengHao Yang
d695115061 Fix: constant etcd_supported_version to dynamic (#12499)
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-08-25 07:15:10 -07:00
k8s-infra-cherrypick-robot
9a09ac5a40 Fix: Change "empty" definition for PodSecurity Admission configuration (#12478)
Fixes a bug where `kube-apiserver` fails to start if the PodSecurity
configuration file doesn't have the `apiVersion` and `kind` keys.

Signed-off-by: Alejandro Macedo <alex.macedopereira@gmail.com>
Co-authored-by: Alejandro Macedo <alex.macedopereira@gmail.com>
2025-08-19 04:51:36 -07:00
k8s-infra-cherrypick-robot
3a1a2bd4f4 Argo CD : checksum support for the install url (#12467)
Fixes https://github.com/kubernetes-sigs/kubespray/issues/12223

Co-authored-by: Romain Lalaut <rlalaut@proton.me>
2025-08-17 20:09:07 -07:00
Ali Afsharzadeh
842e352767 [release-2.28] Upgrade cilium from 1.17.3 to 1.17.7 (#12471)
Signed-off-by: Ali Afsharzadeh <afsharzadeh8@gmail.com>
2025-08-17 19:11:07 -07:00
ChengHao Yang
4bb24b253c Patch versions updates (#12462)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-08-15 13:23:07 -07:00
k8s-infra-cherrypick-robot
f815c80139 bug: fix missing cilium_enable_bgp_control_plane config (#12432)
Co-authored-by: XuhuiSun95 <ericsun1995@gmail.com>
2025-07-26 22:10:27 -07:00
Max Gautier
86fcc2ba59 Patch versions updates (#12431)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-07-25 01:18:28 -07:00
k8s-infra-cherrypick-robot
e7cf546f37 Fix#12385 cilium typo (#12404)
Signed-off-by: wangsifei99 <wangsifei@kylinos.cn>
Co-authored-by: wangsifei99 <wangsifei@kylinos.cn>
2025-07-15 08:14:24 -07:00
k8s-infra-cherrypick-robot
a59585b6dc Fixed syntax error in _bgp_config dict (#12394)
Co-authored-by: mathgaming <mathgaming@outlook.com>
2025-07-11 00:31:28 -07:00
k8s-infra-cherrypick-robot
2d0cc2b4ea Fix kubeadm upgrade node skipPhases with multiple CP nodes (#12384)
Add 1.32 conditional defaults

Restore support for kubeadm upgrade node --skip-phases < 1.32, apply still needs to be restricted

Co-authored-by: Chad Swenson <chadswen@gmail.com>
2025-07-07 23:35:26 -07:00
k8s-infra-cherrypick-robot
9a08afa3f9 [release-2.28] Cilium: Pass cluster DNS to hubble.peerService in values.yaml.j2 (#12374)
* cilium: pass cluster DNS to hubble.peerService in values.yaml.j2

* Add dedicated Hubble variable defaulting to inventory cluster domain

---------

Co-authored-by: Mustafa Mertcan CAM <mertcancam@gmail.com>
2025-07-04 01:23:25 -07:00
k8s-infra-cherrypick-robot
22e9335484 fix(kubeadm): Conditionally add --skip-phases flag for v1.32.0+ (#12354)
Signed-off-by: bo.jiang <bo.jiang@daocloud.io>
Co-authored-by: bo.jiang <bo.jiang@daocloud.io>
2025-06-28 05:44:29 -07:00
k8s-infra-cherrypick-robot
4789e9dd89 fix ETCD_INITIAL_CLUSTER config in etcd.env and etcd-events.env (#12352)
Co-authored-by: liuxu <liuxu623@gmail.com>
2025-06-27 23:54:29 -07:00
k8s-infra-cherrypick-robot
9a86253beb fix: add cilium extraConfig values (#12338)
Co-authored-by: atobaum <atobaum@gmail.com>
2025-06-24 00:00:31 -07:00
ChengHao Yang
1e76d9113b Patch versions updates (#12330)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-06-23 19:22:28 -07:00
k8s-infra-cherrypick-robot
d7c00ce698 [release-2.28] Fix: the cluster is upgraded from 2.27 to 2.28 cilium will break (#12324)
* Fix: add cilium remove old resources option

Give users two options: besides skip Cilium, add
`cilium_remove_old_resources`, default is `false`, when set to `true`,
it will remove the content of the old version, but it will cause the
downtime, need to be careful to use.

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>

* Fix: if cilium release exist, the action will set upgrade

`cilium install` is equivalent to `helm install`, it will failed if
cilium relase exist. `cilium version` can know the release exist without
helm binary

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>

---------

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
Co-authored-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-06-19 20:16:51 -07:00
k8s-infra-cherrypick-robot
cf6e96deb0 Add version pinning for AWS tf provider to fix CI (#12327)
Co-authored-by: Chad Swenson <chadswen@gmail.com>
2025-06-19 19:52:52 -07:00
k8s-infra-cherrypick-robot
0e5c532c9e Skip kube-proxy addon phase during kubeadm upgrade if disabled (#12320)
Co-authored-by: Roman Davydchenko <xatteg@gmail.com>
2025-06-18 04:42:51 -07:00
k8s-infra-cherrypick-robot
a8f5277628 fix manage-offline-container-images.sh get image_id (#12316)
Co-authored-by: DearJay <zhongtianjieyi143@gmail.com>
2025-06-15 07:42:58 -07:00
k8s-infra-cherrypick-robot
1290466c53 Add tico88612 as approver (#12292)
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
Co-authored-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-06-10 10:06:24 -07:00
k8s-infra-cherrypick-robot
6ec991e772 Fix indentation issue in Cilium values file and ensure booleans are lowercase (#12283)
This patch fixes the indentation in the `encryption` section.
Previously configuration like this:

```yml
cilium_encryption_enabled: true
cilium_encryption_type: wireguard
```

Would template to a `values.yaml` file with indentation that looks like this:

```yml
encryption:
  enabled: True
    type: wireguard
    nodeEncryption: False
```

instead of this:

```yml
encryption:
  enabled: true
  type: wireguard
  nodeEncryption: false
```

This syntax issue causes an error during Cilium installation.

This patch also makes all boolean values in this template file go through the `to_json` filter.
Since values like `True` and `False` are not compliant with the YAML v1.2 spec,
avoiding them is preferable.

`to_json` may be used for all other values in this template to ensure we end up with
a valid YAML document in all cases (even when various strings include special characters),
but this was left for another (future) patch.

Co-authored-by: Slavi Pantaleev <slavi@devture.com>
2025-06-10 07:50:24 -07:00
ChengHao Yang
a12e53e845 Bump galaxy.yml version (#12290)
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-06-10 06:44:25 -07:00
Anshuman Agarwala
63cdf87915 Removed equinix provider (#12229) 2025-05-20 03:53:15 -07:00
Max Gautier
175babc4df Move some approvers to emeritus (#12156)
Thanks for you work !
2025-05-20 03:11:17 -07:00
Ekko
6c5c45b328 Allow stopping ubuntu unattended-upgrades (#12174)
Signed-off-by: Ekko Tu <lihai.tu@daocloud.io>
2025-05-20 01:07:16 -07:00
Kubernetes Prow Robot
019cf2ab42 Merge pull request #12101 from tico88612/refactor/cilium-install
Refactor Cilium CNI installation
2025-05-20 01:01:15 -07:00
dependabot[bot]
571e747689 build(deps): bump cryptography from 44.0.3 to 45.0.2 (#12235)
Bumps [cryptography](https://github.com/pyca/cryptography) from 44.0.3 to 45.0.2.
- [Changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/pyca/cryptography/compare/44.0.3...45.0.2)

---
updated-dependencies:
- dependency-name: cryptography
  dependency-version: 45.0.2
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-05-19 07:21:15 -07:00
ChengHao Yang
1266527014 Add cilium cli binary hash before 0.18.3
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-05-19 08:48:15 +08:00
ChengHao Yang
5e2e63ebe3 Make cilium dnsProxy transparent mode configure
When Cilium is configured to replace kube-proxy, it automatically
enables dnsProxy, which can conflict with nodelocaldns.
2025-05-19 08:48:15 +08:00
ChengHao Yang
db290ca686 Add cilium gateway api support
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-05-19 08:48:15 +08:00
ChengHao Yang
6619d98682 Add cilium hubble export dynamic content
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-05-19 08:48:15 +08:00
ChengHao Yang
b771d73fe0 Add cilium hubble export file max backups & size mb
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-05-19 08:48:15 +08:00
ChengHao Yang
65751e8193 Add cilium operator tolerations default values
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-05-19 08:48:15 +08:00
ChengHao Yang
4c16fc155f Cilium values k8sServiceHost and k8sServicePort use auto
Signed-off-by: ChengHao Yang
<17496418+tico88612@users.noreply.github.com>
2025-05-19 08:48:14 +08:00
ChengHao Yang
dcd3461bce Cilium values use image variables
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-05-19 08:48:14 +08:00
ChengHao Yang
48f75c2c2b Upgrade Cilium related images
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-05-19 08:48:14 +08:00
ChengHao Yang
a4b73c09a7 Upgrade cilium version to 1.17.3
Signed-off-by: ChengHao Yang
<17496418+tico88612@users.noreply.github.com>
2025-05-19 08:48:14 +08:00
ChengHao Yang
af62570110 Change cilium_kube_proxy_replacement to true for CI tests
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-05-19 08:48:14 +08:00
ChengHao Yang
bebba47eb4 Change kube_owner to root for cilium CI test
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-05-19 08:48:14 +08:00
ChengHao Yang
86437730de Use cilium-cli install Cilium
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-05-19 08:48:14 +08:00
ChengHao Yang
6fe64323db Remove old cilium templates install
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-05-19 08:48:13 +08:00
ChengHao Yang
1e471d5eeb Upgrade outdated cilium_min_version_required
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-05-19 08:48:11 +08:00
Max Gautier
3a2862ea19 Move checksums to kubespray_defaults/vars (#12234)
The checksums are not a defaults and are not meant to be changed from
the inventories.

Furthermore, role defaults have a lower priority that hosts facts, which
technically means a rogue hosts could hijack the hashes for its
variables.
2025-05-18 16:13:14 -07:00
Jay.H
8a4f4d13f7 fix manage-offline-container-images.sh create_registry (#11964) 2025-05-17 07:25:13 -07:00
ErmolenkoMaxim
46a0dc9a51 Add support for hubble-export-file-max-backups and max-size-mb variables (#12072)
* feat(cilium): add configurable Hubble export log rotation parameters

- Adds support for `cilium_hubble_export_file_max_backups` and `cilium_hubble_export_file_max_size_mb`
- Applies values only if `cilium_hubble_export_file_path` is defined
- Default values are set in role defaults
- Cleans up template logic by removing unnecessary conditionals

* Fix indentation for hubble export settings

* Fix undefined variable issue with ipwrap in kubeconfig override that caused pre-commit errors

* Update main.yml

rollback
2025-05-17 00:35:13 -07:00
Max Gautier
faae36086c Patch versions updates (#12226)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-05-16 14:13:14 -07:00
ERIK
e4c0c427a3 improve NTP package conflict handling (#12212)
Signed-off-by: bo.jiang <bo.jiang@daocloud.io>
2025-05-16 03:55:14 -07:00
Max Gautier
bca5a4ce3b CI: remove ci-not-authorized job (#12225)
This is now handled directly at the failfast-ci level (== integration
Github <-> Gitlab).
The whole pipeline will not be triggered unless:
- The author is a maintainer
- The PR has the /ok-to-test label
2025-05-16 03:27:13 -07:00
Antoine Legrand
5c07c6e6d3 Add option to [not] install coredns via Kubespray (#12218) 2025-05-16 03:23:13 -07:00
Takuya Murakami
c6dfe22a41 Improve logging of kubeadm init failure of first control plane node (#12216)
Split retry task of 'kubeadm init' to show the failure log of
the first execution.
2025-05-16 03:01:13 -07:00
Seena Fallah
ec85b7e2c9 download: respect enable_dns_autoscaler when enabling dnsautoscaler (#12217)
dnsautoscaler should only be enabled when enable_dns_autoscaler is
set to true. without this, it could be enabled without any manifest
actually using it, which makes it a false signal.

Signed-off-by: Seena Fallah <seenafallah@gmail.com>
2025-05-15 12:45:13 -07:00
Kubernetes Prow Robot
acd6872c80 Merge pull request #12219 from VannTen/test/ha_etcd_separate
Fix broken workaround for separate etcd setup
2025-05-15 12:39:14 -07:00
Max Gautier
22d3cf9c2b Move 'pretend certificates' **after** cert distribution
The link target will only exist after we distribute the certs on each node.
2025-05-15 18:35:34 +02:00
Max Gautier
2d3bd8686f Add testcase separate ha-etcd
Also use a distinct node to test certificate distribution.
2025-05-15 18:20:13 +02:00
Hyeonki Hong
2c3b6c9199 feat: add trigger to restart kube-apiserver when config files change (#12172)
* feat: add trigger to restart kube-apiserver when config files change

* fix: remove not upgrade_cluster_setup condition

* refactor: streamline kube-apiserver restart notifications
2025-05-15 06:51:14 -07:00
Max Gautier
a55932e1de Patch versions updates (#12204)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-05-14 18:55:20 -07:00
Max Gautier
973bd2e520 Stop cleaning up containerd packages (#12213)
The switch to not use system packages for containerd packages happened
multiples releases ago ; there should not be any up-to-date installation
of kubespray needing that cleanup.

Remove those steps and variables only used by them.
2025-05-13 21:07:16 -07:00
Kubernetes Prow Robot
ea7331f5fc Merge pull request #12211 from VannTen/cleanup/rename_remove_node
rename-without-hypens: remove-node/pre-remove
2025-05-13 17:13:16 -07:00
Kubernetes Prow Robot
df241800ce Merge pull request #12203 from VannTen/cleanup/rename_bootstrap_os
Rename bootstrap-os to bootstrap_os
2025-05-13 05:03:16 -07:00
Cyclinder
8cc5694580 calico: update calico-kube-controller manifest (#12169) 2025-05-13 01:43:17 -07:00
Max Gautier
1d15baf405 Add compat and deprecation warning for boostrap-os 2025-05-13 09:39:59 +02:00
Max Gautier
47508d5c6e Rename bootstrap-os to bootstrap_os
Role names in ansible collections should not have hyphens.
2025-05-13 09:39:54 +02:00
Max Gautier
2a1ae14275 Compat layer remove-node/pre-remove 2025-05-12 22:22:20 +02:00
Max Gautier
e361def9cd Rename remove-node/pre-remove (no hypens for role in collection) 2025-05-12 22:19:50 +02:00
Max Gautier
fa6888df4c kubernetes_audit: Remove redundant defaults filter (#12208) 2025-05-12 07:23:14 -07:00
Max Gautier
373b952a0c Cleanup CI scripts (#12205)
* Delete unused scripts

- gen_tags.sh: not the right file, produce garbage even if path is fixed
- premoderator.sh: not used since ef6d24a49 (CI require a 'lgtm' or
  'ok-to-test' labels to pass (#11251), 2024-05-31)
- gitlab-branch-cleanup: unused AFAICT

* CI: inline molecule logs

Single use site -> less indirection makes it easier to read.
2025-05-12 05:53:15 -07:00
felipe88alves
9bbd597e20 create cilium_operator_tolerations variable in group_var (#12200)
- This enables ithe override of the tolerations for the cilium-operator deployment
 - default behaviour is to leave the toleration as is unless the var is set
2025-05-12 03:25:15 -07:00
Cheolhui Kim
fceb1516b8 Update: add Cilium LB IP Pool configuration to support ranges (#12140) 2025-05-12 01:39:18 -07:00
Kubernetes Prow Robot
43e19ab281 Merge pull request #12202 from VannTen/cleanup/rename_kubespray_defaults
Rename kubespray-defaults to kubespray_defaults
2025-05-12 01:21:14 -07:00
Max Gautier
4052cd5237 Add compat and deprecation warning for kubespray-defaults 2025-05-12 09:46:07 +02:00
Kim Hyunyoung, Abel
e1be469995 fix: do not mount hubble-ui tls volume when cilium_hubble_tls_generate is false (#12143) 2025-05-11 20:27:14 -07:00
Max Gautier
23d8c9a820 CI: enabled all jobs on daily CI (#12207) 2025-05-11 19:51:14 -07:00
Max Gautier
e618421697 Don't run upgrade-patch jobs on forks (#12206)
With the current github-workflow setup, workflows are triggered on every
forked repository (which is quite wasteful).

Add a condition to only run on the main repository.
2025-05-10 06:15:14 -07:00
Max Gautier
7db2aa1cba Rename kubespray-defaults to kubespray_defaults
Role names in ansible collection should not contains hyphens.
2025-05-10 10:04:37 +02:00
Kubernetes Prow Robot
0c8dfb8e43 Merge pull request #12185 from VannTen/cleanup/iproute_with_the_rest
Move package installation to bootstrap-os
2025-05-09 20:49:14 -07:00
Max Gautier
25e4fa17a8 Split kubespray-defaults (-> network_facts)
kubespray-defaults currently does two things:
- records a number of default variable values (in particular values used
  in several places)
- gather and compose some complex network facts (in particular,
  `fallback_ip` and `no_proxy`

There is no actual reason to couple those two things, and it makes using
defaults more difficult (because computing the network facts is somewhat
expensive, we don't want to do it willy-nilly)

Split the two and adjust import paths as needed.
2025-05-09 21:14:26 +02:00
Max Gautier
bb4b2af02e Drop install of python-libselinux for RHEL family below 8
RHEL 7 and derivates support has been removed from some time, clean up
of leftovers.
2025-05-09 21:14:25 +02:00
ChengHao Yang
27e93ee9f6 Feat: Gateway API early installation (#12189)
The Gateway API needs to be installed first if you want to use Cilium's
Gateway API functionality. The Gateway API is just CRD without any Pod,
Deployment, etc., so I think it can be brought forward to before the CNI
installation.

Signed-off-by: ChengHao Yang
2025-05-09 09:47:14 -07:00
dependabot[bot]
65bcddb9fd build(deps): bump cryptography from 44.0.2 to 44.0.3 (#12190)
Bumps [cryptography](https://github.com/pyca/cryptography) from 44.0.2 to 44.0.3.
- [Changelog](https://github.com/pyca/cryptography/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/pyca/cryptography/compare/44.0.2...44.0.3)

---
updated-dependencies:
- dependency-name: cryptography
  dependency-version: 44.0.3
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-05-09 01:53:14 -07:00
Chad Swenson
76707073c4 Fix indentation on AuthorizationConfiguration task (#12197) 2025-05-09 00:05:19 -07:00
Bas
a104fb6a00 kubedns_version no longer used (#12201)
This variable is documented, but not found in the rest of the sources.
2025-05-09 00:01:14 -07:00
ERIK
1c4b18b089 fix: arm64 checksums for youki and kata-containers (#12173)
Signed-off-by: bo.jiang <bo.jiang@daocloud.io>
2025-05-08 19:05:14 -07:00
Max Gautier
d6d87e9a83 Move cilium_deploy_additionnaly to kubespray-default (#12191)
Instead of using default(false) all over the place, use
kubespray-defaults
2025-05-07 05:05:17 -07:00
Max Gautier
985e4ebb23 Remove versions from inventory sample (#12164)
The recommended usage of kubespray is to use the default versions.
So putting them in inventory/sample is not really very helpful, and
causes:
- churn (keeping the inventory/sample up to date)
- support issues (mismatch between defaults and sample inventory)

Remove all concrete versions from the inventory sample.
2025-05-06 08:43:14 -07:00
Max Gautier
fcc294600c Workaround missing etcd certds on control plane node (#12181) 2025-05-05 01:05:57 -07:00
Max Gautier
a7d681abff Install iputils with other packages 2025-05-04 21:22:49 +02:00
Max Gautier
5867fa1b9f Move back iproute install to system_packages
Packages are now installed before network facts collection, so we can
install iproute with the rest.
2025-05-04 21:22:49 +02:00
Max Gautier
1e79c7b3cb Move package install to bootstrap-os 2025-05-04 21:22:48 +02:00
Max Gautier
34d64d4d04 Remove outdated comment
bootstrap-os does not do anything in sudoers since e2ad6aad5 (bootstrap:
rework role (#4045), 2019-02-11).

So SSH pipelining working is effectively a pre-requisite anyway.
2025-05-04 21:22:48 +02:00
Max Gautier
87726faab4 Move check 'sorted pkgs list to pre-commit'
This is a lint check, which should not live in the playbook itself.
2025-05-04 21:22:47 +02:00
Max Gautier
1b9919547a Split 'offline' assert into their own role
The preinstall assert cover a number of things, many of which depends
only on the inventory, and can be run without any ansible_facts
collected.

Split them off to simplify re-ordering.
2025-05-04 21:22:46 +02:00
Kubernetes Prow Robot
84d96d5195 Merge pull request #12165 from tico88612/fix/failing-test-coredns-autoscaler
Feat: add `dns_autoscaler_affinity` and remove in-place values
2025-05-03 13:17:55 -07:00
ChengHao Yang
1374a97787 Test: ubuntu22-calico-all-in-one-upgrade disable dns autoscaler
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-05-04 00:23:05 +08:00
bin.pan
6f0fc020e8 update containerd.options key name (#12170) 2025-05-02 23:27:55 -07:00
Takuya Murakami
f58a6e2057 docs: Fix offline-environment.md to add 'v' prefix of some versions (#12166)
* docs: Fix offline-environment.md to add 'v' prefix of some versions

Now some version variables (kube_version, etcd_version, etc) don't have 'v' prefix,
so you need to add 'v' prefix to download URLs.

* fix: Fix offline.yml to add 'v' prefix of some versions
2025-05-02 01:57:55 -07:00
Ali Afsharzadeh
09fad4886a Fix path to facts.yml in node facts refresh section (#12177) 2025-05-02 00:39:56 -07:00
Ho Kim
c47711c2f2 fix: correct indent of cpuManagerPolicyOptions (#12123) 2025-05-02 00:27:56 -07:00
Karthik S
a3e6e66204 Etcd Certificates are not generated when adding nodes to an existing cluster with scale.yml (#12120)
* [Issue-12117]-Certificates for the new hosts are not generated during scale.yml

* [Issue-12117]-Certificates for the new hosts are not generated during scale.yml

* [Issue-12117]-Certificates for the new hosts are not generated during scale.yml
2025-05-02 00:03:56 -07:00
ChengHao Yang
2907936c85 Feat: add dns_autoscaler_affinity remove in-place values
Upstream has removed affinity, and fix upgrade failing test.

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-04-28 19:18:19 +08:00
ChengHao Yang
71a323039f Fix: kubelet-csr-approver moves to regular application installation (#12141)
This commit fixed the process to ensure that CCM is installed first to
avoid the chicken-and-egg problem.

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-04-28 01:27:26 -07:00
ChengHao Yang
5e5e509698 Revert "Update cluster-proportional-autoscaler to v1.9.0 (#11982)" (#12168)
This reverts commit 16841a1fb0.
2025-04-28 01:23:32 -07:00
Takuya Murakami
4a598c1ef3 Make kubernetes 1.32.4 default (#12161) 2025-04-25 01:22:30 -07:00
Aviral Agarwal
1da9f0dec4 Fixed kube-vip to use kube-vip/kube-vip-iptables image instead of kube-vip/kube-vip when lb_fwdmethod or kube_vip_lb_fwdmethod is set to masquerade (#12145) 2025-04-24 15:54:30 -07:00
ShinyaIshitobi
629a690886 fix: Enable NRI for containerd and disable plugin when nri_enabled is false (#12152)
* fix(containerd): always render NRI plugin block with conditional disable flag

* feat: enable Node Resource Interface plugin when using containerd

* fix: remove the

* fix: fix for linter
2025-04-24 01:40:33 -07:00
Mathieu Parent
16841a1fb0 Update cluster-proportional-autoscaler to v1.9.0 (#11982) 2025-04-24 01:32:37 -07:00
ERIK
22c19a40fa feat: Update containerd and nerdctl checksums to latest versions (#12154)
Signed-off-by: bo.jiang <bo.jiang@daocloud.io>
2025-04-24 01:02:31 -07:00
ERIK
8f41a2886d Update version comparison syntax and optimize whitespace (#12146)
Signed-off-by: bo.jiang <bo.jiang@daocloud.io>
2025-04-24 00:56:31 -07:00
Max Gautier
38cea5b866 Patch versions updates (#12119)
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
2025-04-23 21:48:30 -07:00
Ekko
4177289ef6 Fix typo in .gitlab-ci/kubevirt.yml (#12134)
Signed-off-by: Ekko Tu <lihai.tu@daocloud.io>
2025-04-18 03:59:06 -07:00
Kubernetes Prow Robot
4ad9f9b535 Merge pull request #11763 from tico88612/feat/gateway-api-v1.2.1
Refactor Gateway API installation process and bump Gateway API v1.2.1
2025-04-11 08:38:42 -07:00
ChengHao Yang
6f58b33de0 Deprecate gateway_api_experimental_channel
Please use `gateway_api_channel` and set `experimental`.

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-04-11 23:04:01 +08:00
ChengHao Yang
9456e792f1 Remove unused Gateway API template
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-04-11 22:57:00 +08:00
ChengHao Yang
7f60dda565 Refactor Gateway API manifests installation process
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-04-11 22:57:00 +08:00
ChengHao Yang
582fe2cbde Add Gateway API download information in kubespray-default
Remove old variables in kubernetes-apps/gateway_api

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-04-11 22:57:00 +08:00
Max Gautier
79fbfdf271 component_hash_update: support calico_crds (#12122)
- add support for "no_arch" downloads: arch-indendendant files such as
  YAML manifests, helm charts, etc.
- wire calico_crds with it.
2025-04-10 02:18:47 -07:00
ChengHao Yang
cfaf397d4a Bump: OpenStack Cloud Controller Manager upgrade to v1.32.0 (#12121)
Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-04-10 01:44:41 -07:00
Kubernetes Prow Robot
2f404de77c Merge pull request #12037 from VannTen/ci/convert_vagrant_to_kubevirt_2
CI: convert remaining vagrant jobs (except IPv6) to kubevirt + cleanups
2025-04-09 01:16:42 -07:00
Mohammd Reza Mollasalehi
d304966d75 doc: fix a broken link in the Calico documentation (#12108) (#12109) 2025-04-08 06:32:46 -07:00
ChengHao Yang
4ce5510c1a [rbd-provisioner] deprecate outdated application and documentation (#12114)
* Cleanup: deprecate rbd-provisioner application

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>

* Docs: remove rbd-provisioner application

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>

---------

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-04-08 06:22:44 -07:00
ChengHao Yang
8032b8281d [cephfs-provisioner] deprecate outdated application and documentation (#12113)
* Cleanup: deprecated CephFS application

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>

* Docs: Remove CephFS Application

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>

---------

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-04-08 03:08:39 -07:00
Farshad Asadpour
45ecceb3e1 docs(terraform): update command for destroying infrastructure in README (#12111) 2025-04-08 02:16:39 -07:00
Max Gautier
5a6ef1dafa Timeout on RHEL subscription check (#12115)
subscription-manager status can in some circumstances just never
terminates, with nothing indicating the problem from the Ansible
playbook log.
This makes it difficult to find the hosts misbehaving.

Add a timeout to the subscription checks (defaulting to 3 minutes). This
should be more than enough for normal circumstances while allowing
easier troubleshooting, as the hosts will be FAILED instead of the
playbook just waiting indefinitely.
2025-04-08 01:24:44 -07:00
Max Gautier
0ae9ab36ce CI: Pin github actions for security (#12105)
Dependabot can still upgrade the action version.
2025-04-03 06:22:38 -07:00
Bas
cf48915657 Documenting offline installation with secure files repo and registry. (#11993)
* Add config for addon helm and local_path_provisioner

* Documenting offline installation with secure files_repo

* Documenting offline installation with secure registry
2025-04-03 02:06:37 -07:00
Fredrik Liv
6f74ef17f7 Upcloud: Add possibility to setup cluster using nodes with no public IPs (#11696)
* terraform upcloud: Added possibility to set up nodes with only private IPs

* terraform upcloud: add support for gateway in private zone

* terraform upcloud: split LB proxy protocol config per backend

* terraform upcloud: fix flexible plans

* terraform upcloud: Removed overview of cluster setup

---------

Co-authored-by: davidumea <david.andersson@elastisys.com>
2025-04-01 07:58:42 -07:00
Max Gautier
fe2ab898b8 component_hash_update: remove obsolete todos (#12098) 2025-03-31 15:18:35 -07:00
dependabot[bot]
c8b8567781 build(deps): bump actions/checkout from 3 to 4 (#12089)
Bumps [actions/checkout](https://github.com/actions/checkout) from 3 to 4.
- [Release notes](https://github.com/actions/checkout/releases)
- [Changelog](https://github.com/actions/checkout/blob/main/CHANGELOG.md)
- [Commits](https://github.com/actions/checkout/compare/v3...v4)

---
updated-dependencies:
- dependency-name: actions/checkout
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-03-31 01:40:34 -07:00
dependabot[bot]
bf86c14d35 build(deps): bump redhat-plumbers-in-action/advanced-issue-labeler (#12090)
Bumps [redhat-plumbers-in-action/advanced-issue-labeler](https://github.com/redhat-plumbers-in-action/advanced-issue-labeler) from 2 to 3.
- [Release notes](https://github.com/redhat-plumbers-in-action/advanced-issue-labeler/releases)
- [Commits](https://github.com/redhat-plumbers-in-action/advanced-issue-labeler/compare/v2...v3)

---
updated-dependencies:
- dependency-name: redhat-plumbers-in-action/advanced-issue-labeler
  dependency-type: direct:production
  update-type: version-update:semver-major
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-03-31 01:14:35 -07:00
dependabot[bot]
e47eb4bc7f build(deps): bump pytest-testinfra from 10.1.1 to 10.2.2 (#12096)
Bumps [pytest-testinfra](https://github.com/pytest-dev/pytest-testinfra) from 10.1.1 to 10.2.2.
- [Release notes](https://github.com/pytest-dev/pytest-testinfra/releases)
- [Changelog](https://github.com/pytest-dev/pytest-testinfra/blob/main/CHANGELOG.rst)
- [Commits](https://github.com/pytest-dev/pytest-testinfra/compare/10.1.1...10.2.2)

---
updated-dependencies:
- dependency-name: pytest-testinfra
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <support@github.com>
Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com>
2025-03-31 01:10:35 -07:00
Max Gautier
5222f48978 auto-update: use a branch prefix rather than suffix (#12097)
This is more in-line with dependabot and similar auto-updaters.

Reduce ci coverage on github action updating (it does not change
kubespray code, no need for testing).
2025-03-31 01:04:36 -07:00
Max Gautier
7b6b7318b2 Remove unused manifest (docs) (#12092)
This file is no longer referenced since e0d67367e (Update installation
doc with vagrant (#8406), 2022-01-11).
2025-03-29 11:26:34 -07:00
Kubernetes Prow Robot
f02d313fee Merge pull request #12093 from VannTen/cleanup/contrib
Cleanup old things in contrib/
2025-03-29 10:16:34 -07:00
Max Gautier
7c9870d15b Remove contrib/mitogen
- the playbook does not work
- the mitogen version is not up to date

This strongly suggests this is not used ; let's drop it.
2025-03-28 09:49:28 +01:00
Max Gautier
c8ea1468d1 Remove unmaintained contrib: kvm-setup 2025-03-28 09:39:30 +01:00
Max Gautier
ad26fe77f5 [contrib] Remove heketi and glusterfs (#12091)
* Remove heketi

Heketi is no longer developed or supported and should not be used
anymore.
Remove the contrib playbook.

* Remove contrib glusterfs

Glusterfs integration with glusterfs is now either deprecated or
unsupported.

Other storage solutions should be preferred.
2025-03-27 15:32:38 -07:00
Max Gautier
aae53a9df5 Add github workflow to auto update hashes daily (#12053) 2025-03-27 14:10:34 -07:00
Farshad Asadpour
1513254622 fix(remove-node): Ensure safety and validation for node removal process (#12085)
This commit enhances the node removal playbook's reliability and safety by implementing the following changes:

1. **Node Validation**: Added a validation step using assert to ensure the `node` variable is defined and contains nodes. If the list is empty or undefined, the playbook fails early, preventing accidental operations on the entire cluster.

2. **Removed Defaulting for Hosts**: Updated tasks to enforce explicit `node` variable input without defaulting to critical groups (e.g., `etcd:k8s_cluster:calico_rr`). By validating `node` beforehand, tasks now solely rely on user-provided input and safely avoid unintended targeting.

3. **Explicit User Confirmation**: Enhanced the confirmation prompt to clarify the scope of the operation. The admin is now required to explicitly confirm node state deletion, ensuring a deliberate decision before proceeding.

These improvements strengthen the reliability and safety of the `remove-node.yml` playbook by eliminating ambiguous behavior, preventing misconfigurations, and ensuring clear interaction during node removal tasks.
2025-03-27 06:10:34 -07:00
Ricky Kwan
4a5b524b98 Ensure metrics port exists for nodelocaldns/nodelocaldns-second daemonsets (#11998)
- update metrics port to use port variable
- unconditionally define ports
2025-03-27 04:14:34 -07:00
Max Gautier
aa0c0851f8 Upgrade kube-router (#12066)
- This happens to fix the fact that kube-router is broken when using the
  service proxy: https://github.com/cloudnativelabs/kube-router/issues/1558
2025-03-26 17:04:38 -07:00
Max Gautier
0fc56ed344 CI: fix terraform
- add default testcase
- fix ansible ssh connection
2025-03-26 20:05:26 +01:00
Max Gautier
5c4e597987 CI: workaround build: disable rebase 2025-03-26 20:05:25 +01:00
Max Gautier
ef133fd93d CI: cleanups leftovers things
include_vars is redundant as the file is already included by extra_vars
2025-03-26 20:05:25 +01:00
Max Gautier
f6ca3bf477 CI: simplify image build job 2025-03-26 20:05:24 +01:00
Max Gautier
b9e251ac7a CI: cleanup terraform + deduplicate and simplify 2025-03-26 20:05:23 +01:00
Max Gautier
43fceebdd3 CI: convert vagrant jobs to kubevirt
Vagrant jobs needs a big cache which makes them slow / sometimes stuck
completely. Using the kubevirt provisionning playbook is now
significantly faster, so do just that.

Having only one provisionner in CI will also allows us to remove some of
the custom runners executors we use for vagrant, and more generally
reduce the CI maintenance.

Our kubevirt CI platform does not support ivp6 yet, so we keep the
relevant jobs in vagrant, but we'll migrate them as well as soon as
possible.
2025-03-26 20:05:21 +01:00
Max Gautier
862aec4dc6 CI: remove 'packet' from jobs name + rename to kubevirt
This is more accurate, the name 'packet' being an aterfact of history
(the Kubevirt jobs used to run on Packet, the previous name of Equinix)
2025-03-26 14:32:26 +01:00
Max Gautier
4f3b214ef5 CI: streamline packet jobs definition
- Take advantage of `parallel:matrix` to make the jobs definition shorter
  and more readable.
- Remove helper scripts which are no longer needed
- Remove redundant indirection in the gitlab-ci pipelines definitions
  (only one user)
2025-03-26 14:32:24 +01:00
ERIK
a4843eaf5e fix: missing 'v' prefix in offline image tags (#12086)
Signed-off-by: bo.jiang <bo.jiang@daocloud.io>
2025-03-26 06:26:34 -07:00
Kay Yan
0f9f9fb569 support kube-proxy nftables (#12060)
Signed-off-by: Kay Yan <kay.yan@daocloud.io>
2025-03-26 01:32:33 -07:00
Farshad Asadpour
e7c70d6169 fix(ingress-nginx): Upgrade ingress-nginx to v1.12.1 and webhook certgen image to v1.5.2 (#12075)
This commit upgrades ingress-nginx to version v1.12.1, addressing multiple critical vulnerabilities including CVE-2025-1974, CVE-2025-1097, CVE-2025-1098, CVE-2025-24513, and CVE-2025-24514 as detailed in the ingress-nginx release notes: https://github.com/kubernetes/ingress-nginx/releases/tag/controller-v1.12.1

Important Notes:
- Fixing CVE-2025-1974 required disabling validation of the generated NGINX configuration during validation of Ingress resources. Invalid Ingress resources may stop the NGINX configuration from being updated.
- Recommended mitigations include enabling annotation validation and disabling snippet annotations.

Alongside this upgrade, the `ingress_nginx_kube_webhook_certgen_image_tag` has been updated to v1.5.2 for compatibility, based on: https://github.com/kubernetes/ingress-nginx/pull/13066

Changelog:
- Updated ingress-nginx version to v1.12.1 in Kubespray.
- Updated `ingress_nginx_kube_webhook_certgen_image_tag` in `roles/kubespray-defaults/defaults/main/download.yml` to v1.5.2.

Fixes: https://github.com/kubernetes-sigs/kubespray/issues/12073
2025-03-25 09:10:38 -07:00
Max Gautier
cd9c21b7d6 CI: fix wrong git config env variables (#12079) 2025-03-25 05:48:39 -07:00
ChengHao Yang
36cd894d58 Bump Docker default version to 28.0 (#12070)
* Cleanup: unsupport docker version

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>

* Cleanup: unsupport OS rhel7

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>

* Feat: upgrade docker package

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>

* Docs: update rhel docker link

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>

* Feat: upgrade docker version

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>

* Docs: update docker version

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>

---------

Signed-off-by: ChengHao Yang <17496418+tico88612@users.noreply.github.com>
2025-03-25 04:38:32 -07:00
Kay Yan
b38312d007 Upgrade CI for openSuse 15.6 (#12074)
Signed-off-by: Kay Yan <kay.yan@daocloud.io>
2025-03-25 03:52:33 -07:00
415 changed files with 2736 additions and 39354 deletions

View File

@@ -39,5 +39,7 @@ exclude_paths:
- .github
- .ansible
- .cache
- .gitlab-ci.yml
- .gitlab-ci
mock_modules:
- gluster.gluster.gluster_volume

View File

@@ -12,3 +12,10 @@ updates:
patterns:
- molecule
- molecule-plugins*
- package-ecosystem: "github-actions"
directory: "/"
labels:
- release-note-none
- ci-short
schedule:
interval: "weekly"

View File

@@ -13,16 +13,16 @@ jobs:
issues: write
steps:
- uses: actions/checkout@v3
- uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683
- name: Parse issue form
uses: stefanbuck/github-issue-parser@v3
uses: stefanbuck/github-issue-parser@2ea9b35a8c584529ed00891a8f7e41dc46d0441e
id: issue-parser
with:
template-path: .github/ISSUE_TEMPLATE/bug-report.yaml
- name: Set labels based on OS field
uses: redhat-plumbers-in-action/advanced-issue-labeler@v2
uses: redhat-plumbers-in-action/advanced-issue-labeler@39087a4b30cb98d57f25f34d617a6af8163c17d9
with:
issue-form: ${{ steps.issue-parser.outputs.jsonString }}
section: os

View File

@@ -0,0 +1,55 @@
name: Upgrade Kubespray components with new patches versions - all branches
on:
schedule:
- cron: '22 2 * * *' # every day, 02:22 UTC
workflow_dispatch:
permissions: {}
jobs:
get-releases-branches:
if: github.repository == 'kubernetes-sigs/kubespray'
runs-on: ubuntu-latest
outputs:
branches: ${{ steps.get-branches.outputs.data }}
steps:
- uses: octokit/graphql-action@8ad880e4d437783ea2ab17010324de1075228110
id: get-branches
with:
query: |
query get_release_branches($owner:String!, $name:String!) {
repository(owner:$owner, name:$name) {
refs(refPrefix: "refs/heads/",
first: 0, # TODO increment once we have release branch with the new checksums format
query: "release-",
orderBy: {
field: ALPHABETICAL,
direction: DESC
}) {
nodes {
name
}
}
}
}
variables: |
owner: ${{ github.repository_owner }}
name: ${{ github.event.repository.name }}
env:
GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
update-versions:
needs: get-releases-branches
strategy:
fail-fast: false
matrix:
branch:
- name: ${{ github.event.repository.default_branch }}
- ${{ fromJSON(needs.get-releases-branches.outputs.branches).repository.refs.nodes }}
uses: ./.github/workflows/upgrade-patch-versions.yml
permissions:
contents: write
pull-requests: write
name: Update patch updates on ${{ matrix.branch.name }}
with:
branch: ${{ matrix.branch.name }}

View File

@@ -0,0 +1,44 @@
on:
workflow_call:
inputs:
branch:
description: Which branch to update with new patch versions
default: master
required: true
type: string
jobs:
update-patch-versions:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@11bd71901bbe5b1630ceea73d27597364c9af683
with:
ref: ${{ inputs.branch }}
- uses: actions/setup-python@v5
with:
python-version: '3.13'
cache: 'pip'
- run: pip install scripts/component_hash_update pre-commit
- run: update-hashes
env:
API_KEY: ${{ secrets.GITHUB_TOKEN }}
- uses: actions/cache@v4
with:
key: pre-commit-hook-propagate
path: |
~/.cache/pre-commit
- run: pre-commit run --all-files propagate-ansible-variables
continue-on-error: true
- uses: peter-evans/create-pull-request@271a8d0340265f705b14b6d32b9829c1cb33d45e
with:
commit-message: Patch versions updates
title: Patch versions updates - ${{ inputs.branch }}
labels: bot
branch: component_hash_update/${{ inputs.branch }}
sign-commits: true
body: |
/kind feature
```release-note
NONE
```

View File

@@ -8,11 +8,11 @@ stages:
variables:
FAILFASTCI_NAMESPACE: 'kargo-ci'
GITLAB_REPOSITORY: 'kargo-ci/kubernetes-sigs-kubespray'
GIT_CONFIG_COUNT: 1
GIT_CONFIG_KEY_0: user.key
GIT_CONFIG_COUNT: 2
GIT_CONFIG_KEY_0: user.email
GIT_CONFIG_VALUE_0: "ci@kubespray.io"
GIT_CONFIG_KEY_1: user.name
GIT_CONFIG_VALUE_1: "CI"
GIT_CONFIG_VALUE_1: "Kubespray CI"
ANSIBLE_FORCE_COLOR: "true"
MAGIC: "ci check this"
GS_ACCESS_KEY_ID: $GS_KEY
@@ -31,12 +31,12 @@ variables:
ANSIBLE_VERBOSITY: 2
RECOVER_CONTROL_PLANE_TEST: "false"
RECOVER_CONTROL_PLANE_TEST_GROUPS: "etcd[2:]:kube_control_plane[1:]"
TERRAFORM_VERSION: 1.3.7
TF_VERSION: 1.3.7
PIPELINE_IMAGE: "$CI_REGISTRY_IMAGE/pipeline:${CI_PIPELINE_ID}-${CI_COMMIT_SHORT_SHA}"
before_script:
- ./tests/scripts/rebase.sh
- mkdir -p /.ssh
- mkdir -p cluster-dump $ANSIBLE_INVENTORY
.job: &job
tags:
@@ -55,53 +55,13 @@ before_script:
extends: .job
needs:
- pipeline-image
- ci-not-authorized
- pre-commit # lint
- vagrant-validate # lint
.testcases: &testcases
extends: .job-moderated
interruptible: true
before_script:
- update-alternatives --install /usr/bin/python python /usr/bin/python3 1
- ./tests/scripts/rebase.sh
- ./tests/scripts/testcases_prepare.sh
script:
- ./tests/scripts/testcases_run.sh
after_script:
- ./tests/scripts/testcases_cleanup.sh
# For failfast, at least 1 job must be defined in .gitlab-ci.yml
# Premoderated with manual actions
ci-not-authorized:
stage: build
before_script: []
after_script: []
rules:
# LGTM or ok-to-test labels
- if: $PR_LABELS =~ /.*,(lgtm|approved|ok-to-test).*|^(lgtm|approved|ok-to-test).*/i
variables:
CI_OK_TO_TEST: '0'
when: always
- if: $CI_PIPELINE_SOURCE == "schedule" || $CI_PIPELINE_SOURCE == "trigger"
variables:
CI_OK_TO_TEST: '0'
- if: $CI_COMMIT_BRANCH == "master"
variables:
CI_OK_TO_TEST: '0'
- when: always
variables:
CI_OK_TO_TEST: '1'
script:
- exit $CI_OK_TO_TEST
tags:
- ffci
needs: []
include:
- .gitlab-ci/build.yml
- .gitlab-ci/lint.yml
- .gitlab-ci/terraform.yml
- .gitlab-ci/packet.yml
- .gitlab-ci/kubevirt.yml
- .gitlab-ci/vagrant.yml
- .gitlab-ci/molecule.yml

View File

@@ -1,5 +1,5 @@
---
.build-container:
pipeline-image:
cache:
key: $CI_COMMIT_REF_SLUG
paths:
@@ -11,23 +11,19 @@
name: gcr.io/kaniko-project/executor:debug
entrypoint: ['']
variables:
TAG: $CI_COMMIT_SHORT_SHA
PROJECT_DIR: $CI_PROJECT_DIR
DOCKERFILE: Dockerfile
GODEBUG: "http2client=0"
before_script:
- echo "{\"auths\":{\"$CI_REGISTRY\":{\"auth\":\"$(echo -n ${CI_REGISTRY_USER}:${CI_REGISTRY_PASSWORD} | base64)\"}}}" > /kaniko/.docker/config.json
# TODO: remove the override
# currently rebase.sh depends on bash (not available in the kaniko image)
# once we have a simpler rebase (which should be easy if the target branch ref is available as variable
# we'll be able to rebase here as well hopefully
before_script: []
script:
- echo "{\"auths\":{\"$CI_REGISTRY\":{\"auth\":\"$(echo -n ${CI_REGISTRY_USER}:${CI_REGISTRY_PASSWORD} | base64)\"}}}" > /kaniko/.docker/config.json
- /kaniko/executor --cache=true
--cache-dir=image-cache
--context $PROJECT_DIR
--dockerfile $PROJECT_DIR/$DOCKERFILE
--context $CI_PROJECT_DIR
--dockerfile $CI_PROJECT_DIR/pipeline.Dockerfile
--label 'git-branch'=$CI_COMMIT_REF_SLUG
--label 'git-tag=$CI_COMMIT_TAG'
--destination $PIPELINE_IMAGE
--log-timestamp=true
pipeline-image:
extends: .build-container
variables:
DOCKERFILE: pipeline.Dockerfile

155
.gitlab-ci/kubevirt.yml Normal file
View File

@@ -0,0 +1,155 @@
---
.kubevirt:
extends: .job-moderated
interruptible: true
script:
- ansible-playbook tests/cloud_playbooks/create-kubevirt.yml
-c local -e @"tests/files/${TESTCASE}.yml"
- ./tests/scripts/testcases_run.sh
variables:
ANSIBLE_TIMEOUT: "120"
tags:
- ffci
needs:
- pipeline-image
# TODO: generate testcases matrixes from the files in tests/files/
# this is needed to avoid the need for PR rebasing when a job was added or removed in the target branch
# (currently, a removed job in the target branch breaks the tests, because the
# pipeline definition is parsed by gitlab before the rebase.sh script)
# CI template for PRs
pr:
stage: deploy-part1
rules:
- if: $PR_LABELS =~ /.*ci-short.*/
when: manual
allow_failure: true
- if: $CI_COMMIT_BRANCH =~ /^pr-.*$/
when: on_success
- if: $CI_PIPELINE_SOURCE == "schedule" && $CI_PIPELINE_SCHEDULE_DESCRIPTION == "daily-ci"
when: on_success
- when: manual
allow_failure: true
extends: .kubevirt
parallel:
matrix:
- TESTCASE:
- almalinux8-calico
- almalinux9-crio
- almalinux9-kube-ovn
- debian11-calico-collection
- debian11-macvlan
- debian12-cilium
- fedora39-kube-router
# FIXME: this test if broken (perma-failing)
- openeuler24-calico
- opensuse15-6-calico
- rockylinux8-calico
- rockylinux9-cilium
- ubuntu20-calico-all-in-one-hardening
- ubuntu20-cilium-sep
- ubuntu20-flannel-collection
- ubuntu20-kube-router-sep
- ubuntu20-kube-router-svc-proxy
- ubuntu22-calico-all-in-one
- ubuntu22-calico-all-in-one-upgrade
- ubuntu24-calico-etcd-datastore
- ubuntu24-ha-separate-etcd
# The ubuntu20-calico-all-in-one jobs are meant as early stages to prevent running the full CI if something is horribly broken
ubuntu20-calico-all-in-one:
stage: deploy-part1
extends: .kubevirt
variables:
TESTCASE: ubuntu20-calico-all-in-one
rules:
- if: $CI_COMMIT_BRANCH =~ /^pr-.*$/
when: on_success
- if: $CI_PIPELINE_SOURCE == "schedule" && $CI_PIPELINE_SCHEDULE_DESCRIPTION == "daily-ci"
when: on_success
- when: manual
allow_failure: true
pr_full:
extends: .kubevirt
stage: deploy-extended
rules:
- if: $PR_LABELS =~ /.*ci-full.*/
when: on_success
- if: $CI_PIPELINE_SOURCE == "schedule" && $CI_PIPELINE_SCHEDULE_DESCRIPTION == "daily-ci"
when: on_success
# Else run as manual
- when: manual
allow_failure: true
parallel:
matrix:
- TESTCASE:
- almalinux9-calico-ha-ebpf
- almalinux9-calico-nodelocaldns-secondary
- debian11-custom-cni
- debian11-kubelet-csr-approver
- debian12-custom-cni-helm
- fedora39-calico-swap-selinux
- fedora39-crio
- ubuntu20-all-in-one-docker
- ubuntu20-calico-ha-wireguard
- ubuntu20-flannel-ha
- ubuntu20-flannel-ha-once
# Need an update of the container image to use schema v2
# update: quay.io/kubespray/vm-amazon-linux-2:latest
manual:
extends: pr_full
parallel:
matrix:
- TESTCASE:
- amazon-linux-2-all-in-one
rules:
- when: manual
allow_failure: true
pr_extended:
extends: .kubevirt
stage: deploy-extended
rules:
- if: $PR_LABELS =~ /.*(ci-extended|ci-full).*/
when: on_success
- if: $CI_PIPELINE_SOURCE == "schedule" && $CI_PIPELINE_SCHEDULE_DESCRIPTION == "daily-ci"
when: on_success
- when: manual
allow_failure: true
parallel:
matrix:
- TESTCASE:
- almalinux9-calico
- almalinux9-calico-remove-node
- almalinux9-docker
- debian11-docker
- debian12-calico
- debian12-docker
- opensuse15-6-docker-cilium
- rockylinux9-calico
- ubuntu20-calico-etcd-kubeadm
- ubuntu20-flannel
- ubuntu22-all-in-one-docker
- ubuntu24-all-in-one-docker
- ubuntu24-calico-all-in-one
# TODO: migrate to pr-full, fix the broken ones
periodic:
allow_failure: true
extends: .kubevirt
rules:
- if: $CI_PIPELINE_SOURCE == "schedule" && $CI_PIPELINE_SCHEDULE_DESCRIPTION == "daily-ci"
when: on_success
parallel:
matrix:
- TESTCASE:
- debian11-calico-upgrade
- debian11-calico-upgrade-once
- debian12-cilium-svc-proxy
- fedora39-calico-selinux
- fedora40-docker-calico
- ubuntu20-calico-etcd-kubeadm-upgrade-ha
- ubuntu20-calico-ha-recover
- ubuntu20-calico-ha-recover-noquorum

View File

@@ -1,19 +1,24 @@
---
.molecule:
tags: [ffci]
only: [/^pr-.*$/]
except: ['triggers']
rules: # run on ci-short as well
- if: $CI_COMMIT_BRANCH =~ /^pr-.*$/
when: on_success
- if: $CI_PIPELINE_SOURCE == "schedule" && $CI_PIPELINE_SCHEDULE_DESCRIPTION == "daily-ci"
when: on_success
- when: manual
allow_failure: true
stage: deploy-part1
image: $PIPELINE_IMAGE
needs:
- pipeline-image
# - ci-not-authorized
before_script:
- ./tests/scripts/rebase.sh
script:
- ./tests/scripts/molecule_run.sh
after_script:
- ./tests/scripts/molecule_logs.sh
- rm -fr molecule_logs
- mkdir -p molecule_logs
- find ~/.cache/molecule/ \( -name '*.out' -o -name '*.err' \) -type f | xargs tar -uf molecule_logs/molecule.tar
- gzip molecule_logs/molecule.tar
artifacts:
when: always
paths:
@@ -31,25 +36,19 @@ molecule:
- container-engine/cri-o
- adduser
- bastion-ssh-config
- bootstrap-os
- bootstrap_os
# CI template for periodic CI jobs
# Enabled when PERIODIC_CI_ENABLED var is set
molecule_full:
only:
variables:
- $PERIODIC_CI_ENABLED
allow_failure: true
rules:
- if: $CI_PIPELINE_SOURCE == "schedule" && $CI_PIPELINE_SCHEDULE_DESCRIPTION == "daily-ci"
when: on_success
- when: manual
allow_failure: true
extends: molecule
parallel:
matrix:
- ROLE:
- container-engine/cri-dockerd
- container-engine/containerd
- container-engine/cri-o
- adduser
- bastion-ssh-config
- bootstrap-os
# FIXME : tests below are perma-failing
- container-engine/kata-containers
- container-engine/gvisor

View File

@@ -1,254 +0,0 @@
---
.packet:
extends: .testcases
variables:
ANSIBLE_TIMEOUT: "120"
CI_PLATFORM: packet
SSH_USER: kubespray
tags:
- ffci
needs:
- pipeline-image
- ci-not-authorized
# CI template for PRs
.packet_pr:
stage: deploy-part1
rules:
- if: $PR_LABELS =~ /.*ci-short.*/
when: manual
allow_failure: true
- if: $CI_COMMIT_BRANCH =~ /^pr-.*$/
when: on_success
- when: manual
allow_failure: true
extends: .packet
## Uncomment this to have multiple stages
# needs:
# - packet_ubuntu20-calico-all-in-one
.packet_pr_short:
stage: deploy-part1
extends: .packet
rules:
- if: $CI_COMMIT_BRANCH =~ /^pr-.*$/
when: on_success
- when: manual
allow_failure: true
.packet_pr_manual:
extends: .packet_pr
stage: deploy-extended
rules:
- if: $PR_LABELS =~ /.*ci-full.*/
when: on_success
# Else run as manual
- when: manual
allow_failure: true
.packet_pr_extended:
extends: .packet_pr
stage: deploy-extended
rules:
- if: $PR_LABELS =~ /.*(ci-extended|ci-full).*/
when: on_success
- when: manual
allow_failure: true
# CI template for periodic CI jobs
# Enabled when PERIODIC_CI_ENABLED var is set
.packet_periodic:
only:
variables:
- $PERIODIC_CI_ENABLED
allow_failure: true
extends: .packet
# The ubuntu20-calico-all-in-one jobs are meant as early stages to prevent running the full CI if something is horribly broken
packet_ubuntu20-calico-all-in-one:
stage: deploy-part1
extends: .packet_pr_short
variables:
RESET_CHECK: "true"
# ### PR JOBS PART2
packet_ubuntu20-crio:
extends: .packet_pr_manual
packet_ubuntu22-calico-all-in-one:
extends: .packet_pr
packet_ubuntu22-calico-all-in-one-upgrade:
extends: .packet_pr
variables:
UPGRADE_TEST: graceful
packet_ubuntu24-calico-etcd-datastore:
extends: .packet_pr
packet_almalinux9-crio:
extends: .packet_pr
packet_almalinux9-kube-ovn:
extends: .packet_pr
packet_debian11-calico-collection:
extends: .packet_pr
packet_debian11-macvlan:
extends: .packet_pr
packet_debian12-cilium:
extends: .packet_pr
packet_almalinux8-calico:
extends: .packet_pr
packet_rockylinux8-calico:
extends: .packet_pr
packet_rockylinux9-cilium:
extends: .packet_pr
variables:
RESET_CHECK: "true"
# Need an update of the container image to use schema v2
# update: quay.io/kubespray/vm-amazon-linux-2:latest
packet_amazon-linux-2-all-in-one:
extends: .packet_pr_manual
rules:
- when: manual
allow_failure: true
packet_opensuse-docker-cilium:
extends: .packet_pr
packet_ubuntu20-cilium-sep:
extends: .packet_pr
packet_openeuler24-calico:
extends: .packet_pr
packet_ubuntu20-calico-all-in-one-hardening:
extends: .packet_pr
## Extended
packet_debian11-docker:
extends: .packet_pr_extended
packet_debian12-docker:
extends: .packet_pr_extended
packet_debian12-calico:
extends: .packet_pr_extended
packet_almalinux9-calico-remove-node:
extends: .packet_pr_extended
variables:
REMOVE_NODE_CHECK: "true"
REMOVE_NODE_NAME: "instance-3"
packet_rockylinux9-calico:
extends: .packet_pr_extended
packet_almalinux9-calico:
extends: .packet_pr_extended
packet_almalinux9-docker:
extends: .packet_pr_extended
packet_ubuntu24-calico-all-in-one:
extends: .packet_pr_extended
packet_ubuntu20-calico-etcd-kubeadm:
extends: .packet_pr_extended
packet_ubuntu24-all-in-one-docker:
extends: .packet_pr_extended
packet_ubuntu22-all-in-one-docker:
extends: .packet_pr_extended
# ### MANUAL JOBS
packet_fedora39-crio:
extends: .packet_pr_manual
packet_ubuntu20-flannel-ha:
extends: .packet_pr_manual
packet_ubuntu20-all-in-one-docker:
extends: .packet_pr_manual
packet_ubuntu20-flannel-ha-once:
extends: .packet_pr_manual
packet_fedora39-calico-swap-selinux:
extends: .packet_pr_manual
packet_almalinux9-calico-ha-ebpf:
extends: .packet_pr_manual
packet_almalinux9-calico-nodelocaldns-secondary:
extends: .packet_pr_manual
packet_debian11-custom-cni:
extends: .packet_pr_manual
packet_debian11-kubelet-csr-approver:
extends: .packet_pr_manual
packet_debian12-custom-cni-helm:
extends: .packet_pr_manual
packet_ubuntu20-calico-ha-wireguard:
extends: .packet_pr_manual
# PERIODIC
packet_fedora40-docker-calico:
stage: deploy-extended
extends: .packet_periodic
variables:
RESET_CHECK: "true"
packet_fedora39-calico-selinux:
stage: deploy-extended
extends: .packet_periodic
packet_ubuntu20-calico-etcd-kubeadm-upgrade-ha:
stage: deploy-extended
extends: .packet_periodic
variables:
UPGRADE_TEST: basic
packet_debian11-calico-upgrade-once:
stage: deploy-extended
extends: .packet_periodic
variables:
UPGRADE_TEST: graceful
packet_ubuntu20-calico-ha-recover:
stage: deploy-extended
extends: .packet_periodic
variables:
RECOVER_CONTROL_PLANE_TEST: "true"
RECOVER_CONTROL_PLANE_TEST_GROUPS: "etcd[2:]:kube_control_plane[1:]"
packet_ubuntu20-calico-ha-recover-noquorum:
stage: deploy-extended
extends: .packet_periodic
variables:
RECOVER_CONTROL_PLANE_TEST: "true"
RECOVER_CONTROL_PLANE_TEST_GROUPS: "etcd[1:]:kube_control_plane[1:]"
packet_debian11-calico-upgrade:
stage: deploy-extended
extends: .packet_periodic
variables:
UPGRADE_TEST: graceful
packet_debian12-cilium-svc-proxy:
stage: deploy-extended
extends: .packet_periodic

View File

@@ -3,30 +3,22 @@
.terraform_install:
extends: .job
needs:
- ci-not-authorized
- pipeline-image
variables:
TF_VAR_public_key_path: "${ANSIBLE_PRIVATE_KEY_FILE}.pub"
TF_VAR_ssh_private_key_path: $ANSIBLE_PRIVATE_KEY_FILE
CLUSTER: $CI_COMMIT_REF_NAME
TERRAFORM_STATE_ROOT: $CI_PROJECT_DIR
stage: deploy-part1
before_script:
- update-alternatives --install /usr/bin/python python /usr/bin/python3 1
- ./tests/scripts/rebase.sh
- ./tests/scripts/testcases_prepare.sh
- mkdir -p cluster-dump $ANSIBLE_INVENTORY
- ./tests/scripts/terraform_install.sh
# Set Ansible config
- cp ansible.cfg ~/.ansible.cfg
# Prepare inventory
- cp contrib/terraform/$PROVIDER/sample-inventory/cluster.tfvars .
- ln -s contrib/terraform/$PROVIDER/hosts
- ln -rs -t $ANSIBLE_INVENTORY contrib/terraform/$PROVIDER/hosts
- terraform -chdir="contrib/terraform/$PROVIDER" init
# Copy SSH keypair
- mkdir -p ~/.ssh
- echo "$PACKET_PRIVATE_KEY" | base64 -d > ~/.ssh/id_rsa
- chmod 400 ~/.ssh/id_rsa
- echo "$PACKET_PUBLIC_KEY" | base64 -d > ~/.ssh/id_rsa.pub
- mkdir -p contrib/terraform/$PROVIDER/group_vars
# Random subnet to avoid routing conflicts
- export TF_VAR_subnet_cidr="10.$(( $RANDOM % 256 )).$(( $RANDOM % 256 )).0/24"
.terraform_validate:
terraform_validate:
extends: .terraform_install
tags: [ffci]
only: ['master', /^pr-.*$/]
@@ -36,6 +28,16 @@
stage: test
needs:
- pipeline-image
parallel:
matrix:
- PROVIDER:
- openstack
- aws
- exoscale
- hetzner
- vsphere
- upcloud
- nifcloud
.terraform_apply:
extends: .terraform_install
@@ -43,99 +45,24 @@
stage: deploy-extended
when: manual
only: [/^pr-.*$/]
artifacts:
when: always
paths:
- cluster-dump/
variables:
ANSIBLE_INVENTORY_UNPARSED_FAILED: "true"
ANSIBLE_INVENTORY: hosts
CI_PLATFORM: tf
TF_VAR_ssh_user: $SSH_USER
ANSIBLE_REMOTE_USER: ubuntu # the openstack terraform module does not handle custom user correctly
ANSIBLE_SSH_RETRIES: 15
TF_VAR_ssh_user: $ANSIBLE_REMOTE_USER
TF_VAR_cluster_name: $CI_JOB_ID
script:
# Set Ansible config
- cp ansible.cfg ~/.ansible.cfg
- ssh-keygen -N '' -f $ANSIBLE_PRIVATE_KEY_FILE -t rsa
- mkdir -p contrib/terraform/$PROVIDER/group_vars
# Random subnet to avoid routing conflicts
- export TF_VAR_subnet_cidr="10.$(( $RANDOM % 256 )).$(( $RANDOM % 256 )).0/24"
- terraform -chdir="contrib/terraform/$PROVIDER" apply -auto-approve -parallelism=1
- tests/scripts/testcases_run.sh
after_script:
# Cleanup regardless of exit code
- ./tests/scripts/testcases_cleanup.sh
tf-validate-openstack:
extends: .terraform_validate
variables:
TF_VERSION: $TERRAFORM_VERSION
PROVIDER: openstack
CLUSTER: $CI_COMMIT_REF_NAME
tf-validate-equinix:
extends: .terraform_validate
variables:
TF_VERSION: $TERRAFORM_VERSION
PROVIDER: equinix
CLUSTER: $CI_COMMIT_REF_NAME
tf-validate-aws:
extends: .terraform_validate
variables:
TF_VERSION: $TERRAFORM_VERSION
PROVIDER: aws
CLUSTER: $CI_COMMIT_REF_NAME
tf-validate-exoscale:
extends: .terraform_validate
variables:
TF_VERSION: $TERRAFORM_VERSION
PROVIDER: exoscale
tf-validate-hetzner:
extends: .terraform_validate
variables:
TF_VERSION: $TERRAFORM_VERSION
PROVIDER: hetzner
tf-validate-vsphere:
extends: .terraform_validate
variables:
TF_VERSION: $TERRAFORM_VERSION
PROVIDER: vsphere
CLUSTER: $CI_COMMIT_REF_NAME
tf-validate-upcloud:
extends: .terraform_validate
variables:
TF_VERSION: $TERRAFORM_VERSION
PROVIDER: upcloud
CLUSTER: $CI_COMMIT_REF_NAME
tf-validate-nifcloud:
extends: .terraform_validate
variables:
TF_VERSION: $TERRAFORM_VERSION
PROVIDER: nifcloud
# tf-packet-ubuntu20-default:
# extends: .terraform_apply
# variables:
# TF_VERSION: $TERRAFORM_VERSION
# PROVIDER: packet
# CLUSTER: $CI_COMMIT_REF_NAME
# TF_VAR_number_of_k8s_masters: "1"
# TF_VAR_number_of_k8s_nodes: "1"
# TF_VAR_plan_k8s_masters: t1.small.x86
# TF_VAR_plan_k8s_nodes: t1.small.x86
# TF_VAR_metro: am
# TF_VAR_public_key_path: ""
# TF_VAR_operating_system: ubuntu_20_04
.ovh_variables: &ovh_variables
OS_AUTH_URL: https://auth.cloud.ovh.net/v3
OS_PROJECT_ID: 8d3cd5d737d74227ace462dee0b903fe
OS_PROJECT_NAME: "9361447987648822"
OS_USER_DOMAIN_NAME: Default
OS_PROJECT_DOMAIN_ID: default
OS_USERNAME: 8XuhBMfkKVrk
OS_REGION_NAME: UK1
OS_INTERFACE: public
OS_IDENTITY_API_VERSION: "3"
- terraform -chdir="contrib/terraform/$PROVIDER" destroy -auto-approve
# Elastx is generously donating resources for Kubespray on Openstack CI
# Contacts: @gix @bl0m1
@@ -169,11 +96,8 @@ tf-elastx_ubuntu20-calico:
allow_failure: true
variables:
<<: *elastx_variables
TF_VERSION: $TERRAFORM_VERSION
PROVIDER: openstack
CLUSTER: $CI_COMMIT_REF_NAME
ANSIBLE_TIMEOUT: "60"
SSH_USER: ubuntu
TF_VAR_number_of_k8s_masters: "1"
TF_VAR_number_of_k8s_masters_no_floating_ip: "0"
TF_VAR_number_of_k8s_masters_no_floating_ip_no_etcd: "0"
@@ -194,46 +118,3 @@ tf-elastx_ubuntu20-calico:
TF_VAR_flavor_k8s_node: 3f73fc93-ec61-4808-88df-2580d94c1a9b # v1-standard-2
TF_VAR_image: ubuntu-20.04-server-latest
TF_VAR_k8s_allowed_remote_ips: '["0.0.0.0/0"]'
# OVH voucher expired, commenting job until things are sorted out
# tf-ovh_cleanup:
# stage: unit-tests
# tags: [light]
# image: python
# environment: ovh
# variables:
# <<: *ovh_variables
# before_script:
# - pip install -r scripts/openstack-cleanup/requirements.txt
# script:
# - ./scripts/openstack-cleanup/main.py
# tf-ovh_ubuntu20-calico:
# extends: .terraform_apply
# when: on_success
# environment: ovh
# variables:
# <<: *ovh_variables
# TF_VERSION: $TERRAFORM_VERSION
# PROVIDER: openstack
# CLUSTER: $CI_COMMIT_REF_NAME
# ANSIBLE_TIMEOUT: "60"
# SSH_USER: ubuntu
# TF_VAR_number_of_k8s_masters: "0"
# TF_VAR_number_of_k8s_masters_no_floating_ip: "1"
# TF_VAR_number_of_k8s_masters_no_floating_ip_no_etcd: "0"
# TF_VAR_number_of_etcd: "0"
# TF_VAR_number_of_k8s_nodes: "0"
# TF_VAR_number_of_k8s_nodes_no_floating_ip: "1"
# TF_VAR_number_of_gfs_nodes_no_floating_ip: "0"
# TF_VAR_number_of_bastions: "0"
# TF_VAR_number_of_k8s_masters_no_etcd: "0"
# TF_VAR_use_neutron: "0"
# TF_VAR_floatingip_pool: "Ext-Net"
# TF_VAR_external_net: "6011fbc9-4cbf-46a4-8452-6890a340b60b"
# TF_VAR_network_name: "Ext-Net"
# TF_VAR_flavor_k8s_master: "defa64c3-bd46-43b4-858a-d93bbae0a229" # s1-8
# TF_VAR_flavor_k8s_node: "defa64c3-bd46-43b4-858a-d93bbae0a229" # s1-8
# TF_VAR_image: "Ubuntu 20.04"
# TF_VAR_k8s_allowed_remote_ips: '["0.0.0.0/0"]'

View File

@@ -1,20 +1,16 @@
---
.vagrant:
extends: .testcases
needs:
- ci-not-authorized
vagrant:
extends: .job-moderated
variables:
CI_PLATFORM: "vagrant"
SSH_USER: "vagrant"
VAGRANT_DEFAULT_PROVIDER: "libvirt"
KUBESPRAY_VAGRANT_CONFIG: tests/files/${CI_JOB_NAME}.rb
KUBESPRAY_VAGRANT_CONFIG: tests/files/${TESTCASE}.rb
DOCKER_NAME: vagrant
VAGRANT_ANSIBLE_TAGS: facts
VAGRANT_HOME: "$CI_PROJECT_DIR/.vagrant.d"
PIP_CACHE_DIR: "$CI_PROJECT_DIR/.cache/pip"
tags: [ffci-vm-large]
# only: [/^pr-.*$/]
# except: ['triggers']
image: quay.io/kubespray/vm-kubespray-ci:v13
services: []
before_script:
@@ -28,54 +24,24 @@
- pip install --no-compile --no-cache-dir -r $CI_PROJECT_DIR/tests/requirements.txt
- ./tests/scripts/vagrant_clean.sh
script:
- vagrant up
- ./tests/scripts/testcases_run.sh
after_script:
- vagrant destroy -f
cache:
key: $CI_JOB_NAME_SLUG
paths:
- .vagrant.d/boxes
- .cache/pip
policy: pull-push # TODO: change to "pull" when not on main
vagrant_ubuntu24-calico-dual-stack:
stage: deploy-extended
extends: .vagrant
rules:
- if: $PR_LABELS =~ /.*(ci-extended|ci-full).*/
when: on_success
allow_failure: false
vagrant_ubuntu24-calico-ipv6only-stack:
stage: deploy-extended
extends: .vagrant
rules:
- if: $PR_LABELS =~ /.*(ci-extended|ci-full).*/
- if: $CI_PIPELINE_SOURCE == "schedule" && $CI_PIPELINE_SCHEDULE_DESCRIPTION == "daily-ci"
when: on_success
allow_failure: false
vagrant_ubuntu20-flannel:
stage: deploy-part1
extends: .vagrant
when: on_success
allow_failure: false
vagrant_ubuntu20-flannel-collection:
stage: deploy-extended
extends: .vagrant
when: manual
vagrant_ubuntu20-kube-router-sep:
stage: deploy-extended
extends: .vagrant
when: manual
# Service proxy test fails connectivity testing
vagrant_ubuntu20-kube-router-svc-proxy:
stage: deploy-extended
extends: .vagrant
when: manual
vagrant_fedora39-kube-router:
stage: deploy-extended
extends: .vagrant
when: manual
# FIXME: this test if broken (perma-failing)
parallel:
matrix:
- TESTCASE:
- ubuntu24-calico-dual-stack
- ubuntu24-calico-ipv6only-stack

View File

@@ -35,8 +35,8 @@ RUN --mount=type=bind,source=requirements.txt,target=requirements.txt \
SHELL ["/bin/bash", "-o", "pipefail", "-c"]
RUN OS_ARCHITECTURE=$(dpkg --print-architecture) \
&& curl -L "https://dl.k8s.io/release/v1.32.3/bin/linux/${OS_ARCHITECTURE}/kubectl" -o /usr/local/bin/kubectl \
&& echo "$(curl -L "https://dl.k8s.io/release/v1.32.3/bin/linux/${OS_ARCHITECTURE}/kubectl.sha256")" /usr/local/bin/kubectl | sha256sum --check \
&& curl -L "https://dl.k8s.io/release/v1.32.8/bin/linux/${OS_ARCHITECTURE}/kubectl" -o /usr/local/bin/kubectl \
&& echo "$(curl -L "https://dl.k8s.io/release/v1.32.8/bin/linux/${OS_ARCHITECTURE}/kubectl.sha256")" /usr/local/bin/kubectl | sha256sum --check \
&& chmod a+x /usr/local/bin/kubectl
COPY *.yml ./

View File

@@ -1,13 +1,10 @@
aliases:
kubespray-approvers:
- cristicalin
- floryut
- liupeng0518
- mzaian
- oomichi
- yankay
- ant31
- mzaian
- tico88612
- vannten
- yankay
kubespray-reviewers:
- cyclinder
- erikjiang
@@ -19,8 +16,12 @@ aliases:
kubespray-emeritus_approvers:
- atoms
- chadswen
- cristicalin
- floryut
- liupeng0518
- luckysb
- mattymo
- miouge1
- oomichi
- riverzhang
- woopstar

View File

@@ -111,32 +111,30 @@ Note:
<!-- BEGIN ANSIBLE MANAGED BLOCK -->
- Core
- [kubernetes](https://github.com/kubernetes/kubernetes) 1.32.3
- [etcd](https://github.com/etcd-io/etcd) 3.5.16
- [docker](https://www.docker.com/) 26.1
- [containerd](https://containerd.io/) 2.0.3
- [kubernetes](https://github.com/kubernetes/kubernetes) 1.32.8
- [etcd](https://github.com/etcd-io/etcd) 3.5.22
- [docker](https://www.docker.com/) 28.0
- [containerd](https://containerd.io/) 2.0.6
- [cri-o](http://cri-o.io/) 1.32.0 (experimental: see [CRI-O Note](docs/CRI/cri-o.md). Only on fedora, ubuntu and centos based OS)
- Network Plugin
- [cni-plugins](https://github.com/containernetworking/plugins) 1.4.1
- [calico](https://github.com/projectcalico/calico) 3.29.2
- [cilium](https://github.com/cilium/cilium) 1.15.9
- [calico](https://github.com/projectcalico/calico) 3.29.5
- [cilium](https://github.com/cilium/cilium) 1.17.7
- [flannel](https://github.com/flannel-io/flannel) 0.22.0
- [kube-ovn](https://github.com/alauda/kube-ovn) 1.12.21
- [kube-router](https://github.com/cloudnativelabs/kube-router) 2.0.0
- [kube-router](https://github.com/cloudnativelabs/kube-router) 2.1.1
- [multus](https://github.com/k8snetworkplumbingwg/multus-cni) 4.1.0
- [weave](https://github.com/rajch/weave) 2.8.7
- [kube-vip](https://github.com/kube-vip/kube-vip) 0.8.0
- Application
- [cert-manager](https://github.com/jetstack/cert-manager) 1.15.3
- [coredns](https://github.com/coredns/coredns) 1.11.3
- [ingress-nginx](https://github.com/kubernetes/ingress-nginx) 1.12.0
- [ingress-nginx](https://github.com/kubernetes/ingress-nginx) 1.12.1
- [argocd](https://argoproj.github.io/) 2.14.5
- [helm](https://helm.sh/) 3.16.4
- [metallb](https://metallb.universe.tf/) 0.13.9
- [registry](https://github.com/distribution/distribution) 2.8.1
- Storage Plugin
- [cephfs-provisioner](https://github.com/kubernetes-incubator/external-storage) 2.1.0-k8s1.11
- [rbd-provisioner](https://github.com/kubernetes-incubator/external-storage) 2.1.1-k8s1.11
- [aws-ebs-csi-plugin](https://github.com/kubernetes-sigs/aws-ebs-csi-driver) 0.5.0
- [azure-csi-plugin](https://github.com/kubernetes-sigs/azuredisk-csi-driver) 1.10.0
- [cinder-csi-plugin](https://github.com/kubernetes/cloud-provider-openstack/blob/master/docs/cinder-csi-plugin/using-cinder-csi-plugin.md) 1.30.0

View File

@@ -45,7 +45,7 @@ The Kubespray Project is released on an as-needed basis. The process is as follo
* Minor releases can change components' versions, but not the major `kube_version`.
Greater `kube_version` requires a new major or minor release. For example, if Kubespray v2.0.0
is bound to `kube_version: 1.4.x`, `calico_version: 0.22.0`, `etcd_version: v3.0.6`,
is bound to `kube_version: 1.4.x`, `calico_version: 0.22.0`, `etcd_version: 3.0.6`,
then Kubespray v2.1.0 may be bound to only minor changes to `kube_version`, like v1.5.1
and *any* changes to other components, like etcd v4, or calico 1.2.3.
And Kubespray v3.x.x shall be bound to `kube_version: 2.x.x` respectively.

2
Vagrantfile vendored
View File

@@ -33,7 +33,7 @@ SUPPORTED_OS = {
"fedora40" => {box: "fedora/40-cloud-base", user: "vagrant"},
"fedora39-arm64" => {box: "bento/fedora-39-arm64", user: "vagrant"},
"fedora40-arm64" => {box: "bento/fedora-40", user: "vagrant"},
"opensuse" => {box: "opensuse/Leap-15.4.x86_64", user: "vagrant"},
"opensuse" => {box: "opensuse/Leap-15.6.x86_64", user: "vagrant"},
"opensuse-tumbleweed" => {box: "opensuse/Tumbleweed.x86_64", user: "vagrant"},
"oraclelinux" => {box: "generic/oracle7", user: "vagrant"},
"oraclelinux8" => {box: "generic/oracle8", user: "vagrant"},

View File

@@ -1,11 +0,0 @@
# Kubespray on KVM Virtual Machines hypervisor preparation
A simple playbook to ensure your system has the right settings to enable Kubespray
deployment on VMs.
This playbook does not create Virtual Machines, nor does it run Kubespray itself.
## User creation
If you want to create a user for running Kubespray deployment, you should specify
both `k8s_deployment_user` and `k8s_deployment_user_pkey_path`.

View File

@@ -1,2 +0,0 @@
#k8s_deployment_user: kubespray
#k8s_deployment_user_pkey_path: /tmp/ssh_rsa

View File

@@ -1,9 +0,0 @@
---
- name: Prepare Hypervisor to later install kubespray VMs
hosts: localhost
gather_facts: false
become: true
vars:
bootstrap_os: none
roles:
- { role: kvm-setup }

View File

@@ -1,30 +0,0 @@
---
- name: Install required packages
package:
name: "{{ item }}"
state: present
with_items:
- bind-utils
- ntp
when: ansible_os_family == "RedHat"
- name: Install required packages
apt:
upgrade: true
update_cache: true
cache_valid_time: 3600
name: "{{ item }}"
state: present
install_recommends: false
with_items:
- dnsutils
- ntp
when: ansible_os_family == "Debian"
- name: Create deployment user if required
include_tasks: user.yml
when: k8s_deployment_user is defined
- name: Set proper sysctl values
import_tasks: sysctl.yml

View File

@@ -1,46 +0,0 @@
---
- name: Load br_netfilter module
community.general.modprobe:
name: br_netfilter
state: present
register: br_netfilter
- name: Add br_netfilter into /etc/modules
lineinfile:
dest: /etc/modules
state: present
line: 'br_netfilter'
when: br_netfilter is defined and ansible_os_family == 'Debian'
- name: Add br_netfilter into /etc/modules-load.d/kubespray.conf
copy:
dest: /etc/modules-load.d/kubespray.conf
content: |-
### This file is managed by Ansible
br-netfilter
owner: root
group: root
mode: "0644"
when: br_netfilter is defined
- name: Enable net.ipv4.ip_forward in sysctl
ansible.posix.sysctl:
name: net.ipv4.ip_forward
value: 1
sysctl_file: "{{ sysctl_file_path }}"
state: present
reload: true
- name: Set bridge-nf-call-{arptables,iptables} to 0
ansible.posix.sysctl:
name: "{{ item }}"
state: present
value: 0
sysctl_file: "{{ sysctl_file_path }}"
reload: true
with_items:
- net.bridge.bridge-nf-call-arptables
- net.bridge.bridge-nf-call-ip6tables
- net.bridge.bridge-nf-call-iptables
when: br_netfilter is defined

View File

@@ -1,47 +0,0 @@
---
- name: Create user {{ k8s_deployment_user }}
user:
name: "{{ k8s_deployment_user }}"
groups: adm
shell: /bin/bash
- name: Ensure that .ssh exists
file:
path: "/home/{{ k8s_deployment_user }}/.ssh"
state: directory
owner: "{{ k8s_deployment_user }}"
group: "{{ k8s_deployment_user }}"
mode: "0700"
- name: Configure sudo for deployment user
copy:
content: |
%{{ k8s_deployment_user }} ALL=(ALL) NOPASSWD: ALL
dest: "/etc/sudoers.d/55-k8s-deployment"
owner: root
group: root
mode: "0644"
- name: Write private SSH key
copy:
src: "{{ k8s_deployment_user_pkey_path }}"
dest: "/home/{{ k8s_deployment_user }}/.ssh/id_rsa"
mode: "0400"
owner: "{{ k8s_deployment_user }}"
group: "{{ k8s_deployment_user }}"
when: k8s_deployment_user_pkey_path is defined
- name: Write public SSH key
shell: "ssh-keygen -y -f /home/{{ k8s_deployment_user }}/.ssh/id_rsa \
> /home/{{ k8s_deployment_user }}/.ssh/authorized_keys"
args:
creates: "/home/{{ k8s_deployment_user }}/.ssh/authorized_keys"
when: k8s_deployment_user_pkey_path is defined
- name: Fix ssh-pub-key permissions
file:
path: "/home/{{ k8s_deployment_user }}/.ssh/authorized_keys"
mode: "0600"
owner: "{{ k8s_deployment_user }}"
group: "{{ k8s_deployment_user }}"
when: k8s_deployment_user_pkey_path is defined

View File

@@ -1,15 +0,0 @@
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
name: kubernetes-dashboard
labels:
k8s-app: kubernetes-dashboard
roleRef:
apiGroup: rbac.authorization.k8s.io
kind: ClusterRole
name: cluster-admin
subjects:
- kind: ServiceAccount
name: kubernetes-dashboard
namespace: kube-system

View File

@@ -1,51 +0,0 @@
---
- name: Check ansible version
import_playbook: kubernetes_sigs.kubespray.ansible_version
- name: Install mitogen
hosts: localhost
strategy: linear
vars:
mitogen_version: 0.3.2
mitogen_url: https://github.com/mitogen-hq/mitogen/archive/refs/tags/v{{ mitogen_version }}.tar.gz
ansible_connection: local
tasks:
- name: Create mitogen plugin dir
file:
path: "{{ item }}"
state: directory
mode: "0755"
become: false
loop:
- "{{ playbook_dir }}/plugins/mitogen"
- "{{ playbook_dir }}/dist"
- name: Download mitogen release
get_url:
url: "{{ mitogen_url }}"
dest: "{{ playbook_dir }}/dist/mitogen_{{ mitogen_version }}.tar.gz"
validate_certs: true
mode: "0644"
- name: Extract archive
unarchive:
src: "{{ playbook_dir }}/dist/mitogen_{{ mitogen_version }}.tar.gz"
dest: "{{ playbook_dir }}/dist/"
- name: Copy plugin
ansible.posix.synchronize:
src: "{{ playbook_dir }}/dist/mitogen-{{ mitogen_version }}/"
dest: "{{ playbook_dir }}/plugins/mitogen"
- name: Add strategy to ansible.cfg
community.general.ini_file:
path: ansible.cfg
mode: "0644"
section: "{{ item.section | d('defaults') }}"
option: "{{ item.option }}"
value: "{{ item.value }}"
with_items:
- option: strategy
value: mitogen_linear
- option: strategy_plugins
value: plugins/mitogen/ansible_mitogen/plugins/strategy

View File

@@ -1,92 +0,0 @@
# Deploying a Kubespray Kubernetes Cluster with GlusterFS
You can either deploy using Ansible on its own by supplying your own inventory file or by using Terraform to create the VMs and then providing a dynamic inventory to Ansible. The following two sections are self-contained, you don't need to go through one to use the other. So, if you want to provision with Terraform, you can skip the **Using an Ansible inventory** section, and if you want to provision with a pre-built ansible inventory, you can neglect the **Using Terraform and Ansible** section.
## Using an Ansible inventory
In the same directory of this ReadMe file you should find a file named `inventory.example` which contains an example setup. Please note that, additionally to the Kubernetes nodes/masters, we define a set of machines for GlusterFS and we add them to the group `[gfs-cluster]`, which in turn is added to the larger `[network-storage]` group as a child group.
Change that file to reflect your local setup (adding more machines or removing them and setting the adequate ip numbers), and save it to `inventory/sample/k8s_gfs_inventory`. Make sure that the settings on `inventory/sample/group_vars/all.yml` make sense with your deployment. Then execute change to the kubespray root folder, and execute (supposing that the machines are all using ubuntu):
```shell
ansible-playbook -b --become-user=root -i inventory/sample/k8s_gfs_inventory --user=ubuntu ./cluster.yml
```
This will provision your Kubernetes cluster. Then, to provision and configure the GlusterFS cluster, from the same directory execute:
```shell
ansible-playbook -b --become-user=root -i inventory/sample/k8s_gfs_inventory --user=ubuntu ./contrib/network-storage/glusterfs/glusterfs.yml
```
If your machines are not using Ubuntu, you need to change the `--user=ubuntu` to the correct user. Alternatively, if your Kubernetes machines are using one OS and your GlusterFS a different one, you can instead specify the `ansible_ssh_user=<correct-user>` variable in the inventory file that you just created, for each machine/VM:
```shell
k8s-master-1 ansible_ssh_host=192.168.0.147 ip=192.168.0.147 ansible_ssh_user=core
k8s-master-node-1 ansible_ssh_host=192.168.0.148 ip=192.168.0.148 ansible_ssh_user=core
k8s-master-node-2 ansible_ssh_host=192.168.0.146 ip=192.168.0.146 ansible_ssh_user=core
```
## Using Terraform and Ansible
First step is to fill in a `my-kubespray-gluster-cluster.tfvars` file with the specification desired for your cluster. An example with all required variables would look like:
```ini
cluster_name = "cluster1"
number_of_k8s_masters = "1"
number_of_k8s_masters_no_floating_ip = "2"
number_of_k8s_nodes_no_floating_ip = "0"
number_of_k8s_nodes = "0"
public_key_path = "~/.ssh/my-desired-key.pub"
image = "Ubuntu 16.04"
ssh_user = "ubuntu"
flavor_k8s_node = "node-flavor-id-in-your-openstack"
flavor_k8s_master = "master-flavor-id-in-your-openstack"
network_name = "k8s-network"
floatingip_pool = "net_external"
# GlusterFS variables
flavor_gfs_node = "gluster-flavor-id-in-your-openstack"
image_gfs = "Ubuntu 16.04"
number_of_gfs_nodes_no_floating_ip = "3"
gfs_volume_size_in_gb = "50"
ssh_user_gfs = "ubuntu"
```
As explained in the general terraform/openstack guide, you need to source your OpenStack credentials file, add your ssh-key to the ssh-agent and setup environment variables for terraform:
```shell
$ source ~/.stackrc
$ eval $(ssh-agent -s)
$ ssh-add ~/.ssh/my-desired-key
$ echo Setting up Terraform creds && \
export TF_VAR_username=${OS_USERNAME} && \
export TF_VAR_password=${OS_PASSWORD} && \
export TF_VAR_tenant=${OS_TENANT_NAME} && \
export TF_VAR_auth_url=${OS_AUTH_URL}
```
Then, standing on the kubespray directory (root base of the Git checkout), issue the following terraform command to create the VMs for the cluster:
```shell
terraform apply -state=contrib/terraform/openstack/terraform.tfstate -var-file=my-kubespray-gluster-cluster.tfvars contrib/terraform/openstack
```
This will create both your Kubernetes and Gluster VMs. Make sure that the ansible file `contrib/terraform/openstack/group_vars/all.yml` includes any ansible variable that you want to setup (like, for instance, the type of machine for bootstrapping).
Then, provision your Kubernetes (kubespray) cluster with the following ansible call:
```shell
ansible-playbook -b --become-user=root -i contrib/terraform/openstack/hosts ./cluster.yml
```
Finally, provision the glusterfs nodes and add the Persistent Volume setup for GlusterFS in Kubernetes through the following ansible call:
```shell
ansible-playbook -b --become-user=root -i contrib/terraform/openstack/hosts ./contrib/network-storage/glusterfs/glusterfs.yml
```
If you need to destroy the cluster, you can run:
```shell
terraform destroy -state=contrib/terraform/openstack/terraform.tfstate -var-file=my-kubespray-gluster-cluster.tfvars contrib/terraform/openstack
```

View File

@@ -1,29 +0,0 @@
---
- name: Bootstrap hosts
hosts: gfs-cluster
gather_facts: false
vars:
ansible_ssh_pipelining: false
roles:
- { role: bootstrap-os, tags: bootstrap-os}
- name: Gather facts
hosts: all
gather_facts: true
- name: Install glusterfs server
hosts: gfs-cluster
vars:
ansible_ssh_pipelining: true
roles:
- { role: glusterfs/server }
- name: Install glusterfs servers
hosts: k8s_cluster
roles:
- { role: glusterfs/client }
- name: Configure Kubernetes to use glusterfs
hosts: kube_control_plane[0]
roles:
- { role: kubernetes-pv }

View File

@@ -1 +0,0 @@
../../../inventory/local/group_vars

View File

@@ -1,43 +0,0 @@
# ## Configure 'ip' variable to bind kubernetes services on a
# ## different ip than the default iface
# node1 ansible_ssh_host=95.54.0.12 # ip=10.3.0.1
# node2 ansible_ssh_host=95.54.0.13 # ip=10.3.0.2
# node3 ansible_ssh_host=95.54.0.14 # ip=10.3.0.3
# node4 ansible_ssh_host=95.54.0.15 # ip=10.3.0.4
# node5 ansible_ssh_host=95.54.0.16 # ip=10.3.0.5
# node6 ansible_ssh_host=95.54.0.17 # ip=10.3.0.6
#
# ## GlusterFS nodes
# ## Set disk_volume_device_1 to desired device for gluster brick, if different to /dev/vdb (default).
# ## As in the previous case, you can set ip to give direct communication on internal IPs
# gfs_node1 ansible_ssh_host=95.54.0.18 # disk_volume_device_1=/dev/vdc ip=10.3.0.7
# gfs_node2 ansible_ssh_host=95.54.0.19 # disk_volume_device_1=/dev/vdc ip=10.3.0.8
# gfs_node3 ansible_ssh_host=95.54.0.20 # disk_volume_device_1=/dev/vdc ip=10.3.0.9
# [kube_control_plane]
# node1
# node2
# [etcd]
# node1
# node2
# node3
# [kube_node]
# node2
# node3
# node4
# node5
# node6
# [k8s_cluster:children]
# kube_node
# kube_control_plane
# [gfs-cluster]
# gfs_node1
# gfs_node2
# gfs_node3
# [network-storage:children]
# gfs-cluster

View File

@@ -1 +0,0 @@
../../../../roles/bootstrap-os

View File

@@ -1,50 +0,0 @@
# Ansible Role: GlusterFS
[![Build Status](https://travis-ci.org/geerlingguy/ansible-role-glusterfs.svg?branch=master)](https://travis-ci.org/geerlingguy/ansible-role-glusterfs)
Installs and configures GlusterFS on Linux.
## Requirements
For GlusterFS to connect between servers, TCP ports `24007`, `24008`, and `24009`/`49152`+ (that port, plus an additional incremented port for each additional server in the cluster; the latter if GlusterFS is version 3.4+), and TCP/UDP port `111` must be open. You can open these using whatever firewall you wish (this can easily be configured using the `geerlingguy.firewall` role).
This role performs basic installation and setup of Gluster, but it does not configure or mount bricks (volumes), since that step is easier to do in a series of plays in your own playbook. Ansible 1.9+ includes the [`gluster_volume`](https://docs.ansible.com/ansible/latest/collections/gluster/gluster/gluster_volume_module.html) module to ease the management of Gluster volumes.
## Role Variables
Available variables are listed below, along with default values (see `defaults/main.yml`):
```yaml
glusterfs_default_release: ""
```
You can specify a `default_release` for apt on Debian/Ubuntu by overriding this variable. This is helpful if you need a different package or version for the main GlusterFS packages (e.g. GlusterFS 3.5.x instead of 3.2.x with the `wheezy-backports` default release on Debian Wheezy).
```yaml
glusterfs_ppa_use: true
glusterfs_ppa_version: "3.5"
```
For Ubuntu, specify whether to use the official Gluster PPA, and which version of the PPA to use. See Gluster's [Getting Started Guide](https://docs.gluster.org/en/latest/Quick-Start-Guide/Quickstart/) for more info.
## Dependencies
None.
## Example Playbook
```yaml
- hosts: server
roles:
- geerlingguy.glusterfs
```
For a real-world use example, read through [Simple GlusterFS Setup with Ansible](http://www.jeffgeerling.com/blog/simple-glusterfs-setup-ansible), a blog post by this role's author, which is included in Chapter 8 of [Ansible for DevOps](https://www.ansiblefordevops.com/).
## License
MIT / BSD
## Author Information
This role was created in 2015 by [Jeff Geerling](http://www.jeffgeerling.com/), author of [Ansible for DevOps](https://www.ansiblefordevops.com/).

View File

@@ -1,11 +0,0 @@
---
# For Ubuntu.
glusterfs_default_release: ""
glusterfs_ppa_use: true
glusterfs_ppa_version: "4.1"
# Gluster configuration.
gluster_mount_dir: /mnt/gluster
gluster_volume_node_mount_dir: /mnt/xfs-drive-gluster
gluster_brick_dir: "{{ gluster_volume_node_mount_dir }}/brick"
gluster_brick_name: gluster

View File

@@ -1,30 +0,0 @@
---
dependencies: []
galaxy_info:
author: geerlingguy
description: GlusterFS installation for Linux.
company: "Midwestern Mac, LLC"
license: "license (BSD, MIT)"
min_ansible_version: "2.0"
platforms:
- name: EL
versions:
- "6"
- "7"
- name: Ubuntu
versions:
- precise
- trusty
- xenial
- name: Debian
versions:
- wheezy
- jessie
galaxy_tags:
- system
- networking
- cloud
- clustering
- files
- sharing

View File

@@ -1,21 +0,0 @@
---
# This is meant for Ubuntu and RedHat installations, where apparently the glusterfs-client is not used from inside
# hyperkube and needs to be installed as part of the system.
# Setup/install tasks.
- name: Setup RedHat distros for glusterfs
include_tasks: setup-RedHat.yml
when: ansible_os_family == 'RedHat' and groups['gfs-cluster'] is defined
- name: Setup Debian distros for glusterfs
include_tasks: setup-Debian.yml
when: ansible_os_family == 'Debian' and groups['gfs-cluster'] is defined
- name: Ensure Gluster mount directories exist.
file:
path: "{{ item }}"
state: directory
mode: "0775"
with_items:
- "{{ gluster_mount_dir }}"
when: ansible_os_family in ["Debian","RedHat"] and groups['gfs-cluster'] is defined

View File

@@ -1,24 +0,0 @@
---
- name: Add PPA for GlusterFS.
apt_repository:
repo: 'ppa:gluster/glusterfs-{{ glusterfs_ppa_version }}'
state: present
update_cache: true
register: glusterfs_ppa_added
when: glusterfs_ppa_use
- name: Ensure GlusterFS client will reinstall if the PPA was just added. # noqa no-handler
apt:
name: "{{ item }}"
state: absent
with_items:
- glusterfs-client
when: glusterfs_ppa_added.changed
- name: Ensure GlusterFS client is installed.
apt:
name: "{{ item }}"
state: present
default_release: "{{ glusterfs_default_release }}"
with_items:
- glusterfs-client

View File

@@ -1,14 +0,0 @@
---
- name: Install Prerequisites
package:
name: "{{ item }}"
state: present
with_items:
- "centos-release-gluster{{ glusterfs_default_release }}"
- name: Install Packages
package:
name: "{{ item }}"
state: present
with_items:
- glusterfs-client

View File

@@ -1,13 +0,0 @@
---
# For Ubuntu.
glusterfs_default_release: ""
glusterfs_ppa_use: true
glusterfs_ppa_version: "3.12"
# Gluster configuration.
gluster_mount_dir: /mnt/gluster
gluster_volume_node_mount_dir: /mnt/xfs-drive-gluster
gluster_brick_dir: "{{ gluster_volume_node_mount_dir }}/brick"
gluster_brick_name: gluster
# Default device to mount for xfs formatting, terraform overrides this by setting the variable in the inventory.
disk_volume_device_1: /dev/vdb

View File

@@ -1,30 +0,0 @@
---
dependencies: []
galaxy_info:
author: geerlingguy
description: GlusterFS installation for Linux.
company: "Midwestern Mac, LLC"
license: "license (BSD, MIT)"
min_ansible_version: "2.0"
platforms:
- name: EL
versions:
- "6"
- "7"
- name: Ubuntu
versions:
- precise
- trusty
- xenial
- name: Debian
versions:
- wheezy
- jessie
galaxy_tags:
- system
- networking
- cloud
- clustering
- files
- sharing

View File

@@ -1,113 +0,0 @@
---
# Include variables and define needed variables.
- name: Include OS-specific variables.
include_vars: "{{ ansible_os_family }}.yml"
# Install xfs package
- name: Install xfs Debian
apt:
name: xfsprogs
state: present
when: ansible_os_family == "Debian"
- name: Install xfs RedHat
package:
name: xfsprogs
state: present
when: ansible_os_family == "RedHat"
# Format external volumes in xfs
- name: Format volumes in xfs
community.general.filesystem:
fstype: xfs
dev: "{{ disk_volume_device_1 }}"
# Mount external volumes
- name: Mounting new xfs filesystem
ansible.posix.mount:
name: "{{ gluster_volume_node_mount_dir }}"
src: "{{ disk_volume_device_1 }}"
fstype: xfs
state: mounted
# Setup/install tasks.
- name: Setup RedHat distros for glusterfs
include_tasks: setup-RedHat.yml
when: ansible_os_family == 'RedHat'
- name: Setup Debian distros for glusterfs
include_tasks: setup-Debian.yml
when: ansible_os_family == 'Debian'
- name: Ensure GlusterFS is started and enabled at boot.
service:
name: "{{ glusterfs_daemon }}"
state: started
enabled: true
- name: Ensure Gluster brick and mount directories exist.
file:
path: "{{ item }}"
state: directory
mode: "0775"
with_items:
- "{{ gluster_brick_dir }}"
- "{{ gluster_mount_dir }}"
- name: Configure Gluster volume with replicas
gluster.gluster.gluster_volume:
state: present
name: "{{ gluster_brick_name }}"
brick: "{{ gluster_brick_dir }}"
replicas: "{{ groups['gfs-cluster'] | length }}"
cluster: "{% for item in groups['gfs-cluster'] -%}{{ hostvars[item]['ip'] | default(hostvars[item].ansible_default_ipv4['address']) }}{% if not loop.last %},{% endif %}{%- endfor %}"
host: "{{ inventory_hostname }}"
force: true
run_once: true
when: groups['gfs-cluster'] | length > 1
- name: Configure Gluster volume without replicas
gluster.gluster.gluster_volume:
state: present
name: "{{ gluster_brick_name }}"
brick: "{{ gluster_brick_dir }}"
cluster: "{% for item in groups['gfs-cluster'] -%}{{ hostvars[item]['ip'] | default(hostvars[item].ansible_default_ipv4['address']) }}{% if not loop.last %},{% endif %}{%- endfor %}"
host: "{{ inventory_hostname }}"
force: true
run_once: true
when: groups['gfs-cluster'] | length <= 1
- name: Mount glusterfs to retrieve disk size
ansible.posix.mount:
name: "{{ gluster_mount_dir }}"
src: "{{ ip | default(ansible_default_ipv4['address']) }}:/gluster"
fstype: glusterfs
opts: "defaults,_netdev"
state: mounted
when: groups['gfs-cluster'] is defined and inventory_hostname == groups['gfs-cluster'][0]
- name: Get Gluster disk size
setup:
filter: ansible_mounts
register: mounts_data
when: groups['gfs-cluster'] is defined and inventory_hostname == groups['gfs-cluster'][0]
- name: Set Gluster disk size to variable
set_fact:
gluster_disk_size_gb: "{{ (mounts_data.ansible_facts.ansible_mounts | selectattr('mount', 'equalto', gluster_mount_dir) | map(attribute='size_total') | first | int / (1024 * 1024 * 1024)) | int }}"
when: groups['gfs-cluster'] is defined and inventory_hostname == groups['gfs-cluster'][0]
- name: Create file on GlusterFS
template:
dest: "{{ gluster_mount_dir }}/.test-file.txt"
src: test-file.txt
mode: "0644"
when: groups['gfs-cluster'] is defined and inventory_hostname == groups['gfs-cluster'][0]
- name: Unmount glusterfs
ansible.posix.mount:
name: "{{ gluster_mount_dir }}"
fstype: glusterfs
src: "{{ ip | default(ansible_default_ipv4['address']) }}:/gluster"
state: unmounted
when: groups['gfs-cluster'] is defined and inventory_hostname == groups['gfs-cluster'][0]

View File

@@ -1,26 +0,0 @@
---
- name: Add PPA for GlusterFS.
apt_repository:
repo: 'ppa:gluster/glusterfs-{{ glusterfs_ppa_version }}'
state: present
update_cache: true
register: glusterfs_ppa_added
when: glusterfs_ppa_use
- name: Ensure GlusterFS will reinstall if the PPA was just added. # noqa no-handler
apt:
name: "{{ item }}"
state: absent
with_items:
- glusterfs-server
- glusterfs-client
when: glusterfs_ppa_added.changed
- name: Ensure GlusterFS is installed.
apt:
name: "{{ item }}"
state: present
default_release: "{{ glusterfs_default_release }}"
with_items:
- glusterfs-server
- glusterfs-client

View File

@@ -1,15 +0,0 @@
---
- name: Install Prerequisites
package:
name: "{{ item }}"
state: present
with_items:
- "centos-release-gluster{{ glusterfs_default_release }}"
- name: Install Packages
package:
name: "{{ item }}"
state: present
with_items:
- glusterfs-server
- glusterfs-client

View File

@@ -1,2 +0,0 @@
---
glusterfs_daemon: glusterd

View File

@@ -1,2 +0,0 @@
---
glusterfs_daemon: glusterd

View File

@@ -1,23 +0,0 @@
---
- name: Kubernetes Apps | Lay Down k8s GlusterFS Endpoint and PV
template:
src: "{{ item.file }}"
dest: "{{ kube_config_dir }}/{{ item.dest }}"
mode: "0644"
with_items:
- { file: glusterfs-kubernetes-endpoint.json.j2, type: ep, dest: glusterfs-kubernetes-endpoint.json}
- { file: glusterfs-kubernetes-pv.yml.j2, type: pv, dest: glusterfs-kubernetes-pv.yml}
- { file: glusterfs-kubernetes-endpoint-svc.json.j2, type: svc, dest: glusterfs-kubernetes-endpoint-svc.json}
register: gluster_pv
when: inventory_hostname == groups['kube_control_plane'][0] and groups['gfs-cluster'] is defined and hostvars[groups['gfs-cluster'][0]].gluster_disk_size_gb is defined
- name: Kubernetes Apps | Set GlusterFS endpoint and PV
kube:
name: glusterfs
namespace: default
kubectl: "{{ bin_dir }}/kubectl"
resource: "{{ item.item.type }}"
filename: "{{ kube_config_dir }}/{{ item.item.dest }}"
state: "{{ item.changed | ternary('latest', 'present') }}"
with_items: "{{ gluster_pv.results }}"
when: inventory_hostname == groups['kube_control_plane'][0] and groups['gfs-cluster'] is defined

View File

@@ -1,12 +0,0 @@
{
"kind": "Service",
"apiVersion": "v1",
"metadata": {
"name": "glusterfs"
},
"spec": {
"ports": [
{"port": 1}
]
}
}

View File

@@ -1,23 +0,0 @@
{
"kind": "Endpoints",
"apiVersion": "v1",
"metadata": {
"name": "glusterfs"
},
"subsets": [
{% for host in groups['gfs-cluster'] %}
{
"addresses": [
{
"ip": "{{hostvars[host]['ip']|default(hostvars[host].ansible_default_ipv4['address'])}}"
}
],
"ports": [
{
"port": 1
}
]
}{%- if not loop.last %}, {% endif -%}
{% endfor %}
]
}

View File

@@ -1,14 +0,0 @@
apiVersion: v1
kind: PersistentVolume
metadata:
name: glusterfs
spec:
capacity:
storage: "{{ hostvars[groups['gfs-cluster'][0]].gluster_disk_size_gb }}Gi"
accessModes:
- ReadWriteMany
glusterfs:
endpoints: glusterfs
path: gluster
readOnly: false
persistentVolumeReclaimPolicy: Retain

View File

@@ -1,3 +0,0 @@
---
dependencies:
- {role: kubernetes-pv/ansible, tags: apps}

View File

@@ -1,27 +0,0 @@
# Deploy Heketi/Glusterfs into Kubespray/Kubernetes
This playbook aims to automate [this](https://github.com/heketi/heketi/blob/master/docs/admin/install-kubernetes.md) tutorial. It deploys heketi/glusterfs into kubernetes and sets up a storageclass.
## Important notice
> Due to resource limits on the current project maintainers and general lack of contributions we are considering placing Heketi into a [near-maintenance mode](https://github.com/heketi/heketi#important-notice)
## Client Setup
Heketi provides a CLI that provides users with a means to administer the deployment and configuration of GlusterFS in Kubernetes. [Download and install the heketi-cli](https://github.com/heketi/heketi/releases) on your client machine.
## Install
Copy the inventory.yml.sample over to inventory/sample/k8s_heketi_inventory.yml and change it according to your setup.
```shell
ansible-playbook --ask-become -i inventory/sample/k8s_heketi_inventory.yml contrib/network-storage/heketi/heketi.yml
```
## Tear down
```shell
ansible-playbook --ask-become -i inventory/sample/k8s_heketi_inventory.yml contrib/network-storage/heketi/heketi-tear-down.yml
```
Add `--extra-vars "heketi_remove_lvm=true"` to the command above to remove LVM packages from the system

View File

@@ -1,11 +0,0 @@
---
- name: Tear down heketi
hosts: kube_control_plane[0]
roles:
- { role: tear-down }
- name: Teardown disks in heketi
hosts: heketi-node
become: true
roles:
- { role: tear-down-disks }

View File

@@ -1,12 +0,0 @@
---
- name: Prepare heketi install
hosts: heketi-node
roles:
- { role: prepare }
- name: Provision heketi
hosts: kube_control_plane[0]
tags:
- "provision"
roles:
- { role: provision }

View File

@@ -1,33 +0,0 @@
all:
vars:
heketi_admin_key: "11elfeinhundertundelf"
heketi_user_key: "!!einseinseins"
glusterfs_daemonset:
readiness_probe:
timeout_seconds: 3
initial_delay_seconds: 3
liveness_probe:
timeout_seconds: 3
initial_delay_seconds: 10
children:
k8s_cluster:
vars:
kubelet_fail_swap_on: false
children:
kube_control_plane:
hosts:
node1:
etcd:
hosts:
node2:
kube_node:
hosts: &kube_nodes
node1:
node2:
node3:
node4:
heketi-node:
vars:
disk_volume_device_1: "/dev/vdb"
hosts:
<<: *kube_nodes

View File

@@ -1 +0,0 @@
jmespath

View File

@@ -1,24 +0,0 @@
---
- name: "Load lvm kernel modules"
become: true
with_items:
- "dm_snapshot"
- "dm_mirror"
- "dm_thin_pool"
community.general.modprobe:
name: "{{ item }}"
state: "present"
- name: "Install glusterfs mount utils (RedHat)"
become: true
package:
name: "glusterfs-fuse"
state: "present"
when: "ansible_os_family == 'RedHat'"
- name: "Install glusterfs mount utils (Debian)"
become: true
apt:
name: "glusterfs-client"
state: "present"
when: "ansible_os_family == 'Debian'"

View File

@@ -1,3 +0,0 @@
---
- name: "Stop port forwarding"
command: "killall "

View File

@@ -1,64 +0,0 @@
---
# Bootstrap heketi
- name: "Get state of heketi service, deployment and pods."
register: "initial_heketi_state"
changed_when: false
command: "{{ bin_dir }}/kubectl get services,deployments,pods --selector=deploy-heketi --output=json"
- name: "Bootstrap heketi."
when:
- "(initial_heketi_state.stdout | from_json | json_query(\"items[?kind=='Service']\")) | length == 0"
- "(initial_heketi_state.stdout | from_json | json_query(\"items[?kind=='Deployment']\")) | length == 0"
- "(initial_heketi_state.stdout | from_json | json_query(\"items[?kind=='Pod']\")) | length == 0"
include_tasks: "bootstrap/deploy.yml"
# Prepare heketi topology
- name: "Get heketi initial pod state."
register: "initial_heketi_pod"
command: "{{ bin_dir }}/kubectl get pods --selector=deploy-heketi=pod,glusterfs=heketi-pod,name=deploy-heketi --output=json"
changed_when: false
- name: "Ensure heketi bootstrap pod is up."
assert:
that: "(initial_heketi_pod.stdout | from_json | json_query('items[*]')) | length == 1"
- name: Store the initial heketi pod name
set_fact:
initial_heketi_pod_name: "{{ initial_heketi_pod.stdout | from_json | json_query(\"items[*].metadata.name | [0]\") }}"
- name: "Test heketi topology."
changed_when: false
register: "heketi_topology"
command: "{{ bin_dir }}/kubectl exec {{ initial_heketi_pod_name }} -- heketi-cli --user admin --secret {{ heketi_admin_key }} topology info --json"
- name: "Load heketi topology."
when: "heketi_topology.stdout | from_json | json_query(\"clusters[*].nodes[*]\") | flatten | length == 0"
include_tasks: "bootstrap/topology.yml"
# Provision heketi database volume
- name: "Prepare heketi volumes."
include_tasks: "bootstrap/volumes.yml"
# Remove bootstrap heketi
- name: "Tear down bootstrap."
include_tasks: "bootstrap/tear-down.yml"
# Prepare heketi storage
- name: "Test heketi storage."
command: "{{ bin_dir }}/kubectl get secrets,endpoints,services,jobs --output=json"
changed_when: false
register: "heketi_storage_state"
# ensure endpoints actually exist before trying to move database data to it
- name: "Create heketi storage."
include_tasks: "bootstrap/storage.yml"
vars:
secret_query: "items[?metadata.name=='heketi-storage-secret' && kind=='Secret']"
endpoints_query: "items[?metadata.name=='heketi-storage-endpoints' && kind=='Endpoints']"
service_query: "items[?metadata.name=='heketi-storage-endpoints' && kind=='Service']"
job_query: "items[?metadata.name=='heketi-storage-copy-job' && kind=='Job']"
when:
- "heketi_storage_state.stdout | from_json | json_query(secret_query) | length == 0"
- "heketi_storage_state.stdout | from_json | json_query(endpoints_query) | length == 0"
- "heketi_storage_state.stdout | from_json | json_query(service_query) | length == 0"
- "heketi_storage_state.stdout | from_json | json_query(job_query) | length == 0"

View File

@@ -1,27 +0,0 @@
---
- name: "Kubernetes Apps | Lay Down Heketi Bootstrap"
become: true
template:
src: "heketi-bootstrap.json.j2"
dest: "{{ kube_config_dir }}/heketi-bootstrap.json"
mode: "0640"
register: "rendering"
- name: "Kubernetes Apps | Install and configure Heketi Bootstrap"
kube:
name: "GlusterFS"
kubectl: "{{ bin_dir }}/kubectl"
filename: "{{ kube_config_dir }}/heketi-bootstrap.json"
state: "{{ rendering.changed | ternary('latest', 'present') }}"
- name: "Wait for heketi bootstrap to complete."
changed_when: false
register: "initial_heketi_state"
vars:
initial_heketi_state: { stdout: "{}" }
pods_query: "items[?kind=='Pod'].status.conditions | [0][?type=='Ready'].status | [0]"
deployments_query: "items[?kind=='Deployment'].status.conditions | [0][?type=='Available'].status | [0]"
command: "{{ bin_dir }}/kubectl get services,deployments,pods --selector=deploy-heketi --output=json"
until:
- "initial_heketi_state.stdout | from_json | json_query(pods_query) == 'True'"
- "initial_heketi_state.stdout | from_json | json_query(deployments_query) == 'True'"
retries: 60
delay: 5

View File

@@ -1,33 +0,0 @@
---
- name: "Test heketi storage."
command: "{{ bin_dir }}/kubectl get secrets,endpoints,services,jobs --output=json"
changed_when: false
register: "heketi_storage_state"
- name: "Create heketi storage."
kube:
name: "GlusterFS"
kubectl: "{{ bin_dir }}/kubectl"
filename: "{{ kube_config_dir }}/heketi-storage-bootstrap.json"
state: "present"
vars:
secret_query: "items[?metadata.name=='heketi-storage-secret' && kind=='Secret']"
endpoints_query: "items[?metadata.name=='heketi-storage-endpoints' && kind=='Endpoints']"
service_query: "items[?metadata.name=='heketi-storage-endpoints' && kind=='Service']"
job_query: "items[?metadata.name=='heketi-storage-copy-job' && kind=='Job']"
when:
- "heketi_storage_state.stdout | from_json | json_query(secret_query) | length == 0"
- "heketi_storage_state.stdout | from_json | json_query(endpoints_query) | length == 0"
- "heketi_storage_state.stdout | from_json | json_query(service_query) | length == 0"
- "heketi_storage_state.stdout | from_json | json_query(job_query) | length == 0"
register: "heketi_storage_result"
- name: "Get state of heketi database copy job."
command: "{{ bin_dir }}/kubectl get jobs --output=json"
changed_when: false
register: "heketi_storage_state"
vars:
heketi_storage_state: { stdout: "{}" }
job_query: "items[?metadata.name=='heketi-storage-copy-job' && kind=='Job' && status.succeeded==1]"
until:
- "heketi_storage_state.stdout | from_json | json_query(job_query) | length == 1"
retries: 60
delay: 5

View File

@@ -1,14 +0,0 @@
---
- name: "Get existing Heketi deploy resources."
command: "{{ bin_dir }}/kubectl get all --selector=\"deploy-heketi\" -o=json"
register: "heketi_resources"
changed_when: false
- name: "Delete bootstrap Heketi."
command: "{{ bin_dir }}/kubectl delete all,service,jobs,deployment,secret --selector=\"deploy-heketi\""
when: "heketi_resources.stdout | from_json | json_query('items[*]') | length > 0"
- name: "Ensure there is nothing left over."
command: "{{ bin_dir }}/kubectl get all,service,jobs,deployment,secret --selector=\"deploy-heketi\" -o=json"
register: "heketi_result"
until: "heketi_result.stdout | from_json | json_query('items[*]') | length == 0"
retries: 60
delay: 5

View File

@@ -1,27 +0,0 @@
---
- name: "Get heketi topology."
changed_when: false
register: "heketi_topology"
command: "{{ bin_dir }}/kubectl exec {{ initial_heketi_pod_name }} -- heketi-cli --user admin --secret {{ heketi_admin_key }} topology info --json"
- name: "Render heketi topology template."
become: true
vars: { nodes: "{{ groups['heketi-node'] }}" }
register: "render"
template:
src: "topology.json.j2"
dest: "{{ kube_config_dir }}/topology.json"
mode: "0644"
- name: "Copy topology configuration into container."
changed_when: false
command: "{{ bin_dir }}/kubectl cp {{ kube_config_dir }}/topology.json {{ initial_heketi_pod_name }}:/tmp/topology.json"
- name: "Load heketi topology." # noqa no-handler
when: "render.changed"
command: "{{ bin_dir }}/kubectl exec {{ initial_heketi_pod_name }} -- heketi-cli --user admin --secret {{ heketi_admin_key }} topology load --json=/tmp/topology.json"
register: "load_heketi"
- name: "Get heketi topology."
changed_when: false
register: "heketi_topology"
command: "{{ bin_dir }}/kubectl exec {{ initial_heketi_pod_name }} -- heketi-cli --user admin --secret {{ heketi_admin_key }} topology info --json"
until: "heketi_topology.stdout | from_json | json_query(\"clusters[*].nodes[*].devices[?state=='online'].id\") | flatten | length == groups['heketi-node'] | length"
retries: 60
delay: 5

View File

@@ -1,41 +0,0 @@
---
- name: "Get heketi volume ids."
command: "{{ bin_dir }}/kubectl exec {{ initial_heketi_pod_name }} -- heketi-cli --user admin --secret {{ heketi_admin_key }} volume list --json"
changed_when: false
register: "heketi_volumes"
- name: "Get heketi volumes."
changed_when: false
command: "{{ bin_dir }}/kubectl exec {{ initial_heketi_pod_name }} -- heketi-cli --user admin --secret {{ heketi_admin_key }} volume info {{ volume_id }} --json"
with_items: "{{ heketi_volumes.stdout | from_json | json_query(\"volumes[*]\") }}"
loop_control: { loop_var: "volume_id" }
register: "volumes_information"
- name: "Test heketi database volume."
set_fact: { heketi_database_volume_exists: true }
with_items: "{{ volumes_information.results }}"
loop_control: { loop_var: "volume_information" }
vars: { volume: "{{ volume_information.stdout | from_json }}" }
when: "volume.name == 'heketidbstorage'"
- name: "Provision database volume."
command: "{{ bin_dir }}/kubectl exec {{ initial_heketi_pod_name }} -- heketi-cli --user admin --secret {{ heketi_admin_key }} setup-openshift-heketi-storage"
when: "heketi_database_volume_exists is undefined"
- name: "Copy configuration from pod."
become: true
command: "{{ bin_dir }}/kubectl cp {{ initial_heketi_pod_name }}:/heketi-storage.json {{ kube_config_dir }}/heketi-storage-bootstrap.json"
- name: "Get heketi volume ids."
command: "{{ bin_dir }}/kubectl exec {{ initial_heketi_pod_name }} -- heketi-cli --user admin --secret {{ heketi_admin_key }} volume list --json"
changed_when: false
register: "heketi_volumes"
- name: "Get heketi volumes."
changed_when: false
command: "{{ bin_dir }}/kubectl exec {{ initial_heketi_pod_name }} -- heketi-cli --user admin --secret {{ heketi_admin_key }} volume info {{ volume_id }} --json"
with_items: "{{ heketi_volumes.stdout | from_json | json_query(\"volumes[*]\") }}"
loop_control: { loop_var: "volume_id" }
register: "volumes_information"
- name: "Test heketi database volume."
set_fact: { heketi_database_volume_created: true }
with_items: "{{ volumes_information.results }}"
loop_control: { loop_var: "volume_information" }
vars: { volume: "{{ volume_information.stdout | from_json }}" }
when: "volume.name == 'heketidbstorage'"
- name: "Ensure heketi database volume exists."
assert: { that: "heketi_database_volume_created is defined", msg: "Heketi database volume does not exist." }

View File

@@ -1,4 +0,0 @@
---
- name: "Clean up left over jobs."
command: "{{ bin_dir }}/kubectl delete jobs,pods --selector=\"deploy-heketi\""
changed_when: false

View File

@@ -1,44 +0,0 @@
---
- name: "Kubernetes Apps | Lay Down GlusterFS Daemonset"
template:
src: "glusterfs-daemonset.json.j2"
dest: "{{ kube_config_dir }}/glusterfs-daemonset.json"
mode: "0644"
become: true
register: "rendering"
- name: "Kubernetes Apps | Install and configure GlusterFS daemonset"
kube:
name: "GlusterFS"
kubectl: "{{ bin_dir }}/kubectl"
filename: "{{ kube_config_dir }}/glusterfs-daemonset.json"
state: "{{ rendering.changed | ternary('latest', 'present') }}"
- name: "Kubernetes Apps | Label GlusterFS nodes"
include_tasks: "glusterfs/label.yml"
with_items: "{{ groups['heketi-node'] }}"
loop_control:
loop_var: "node"
- name: "Kubernetes Apps | Wait for daemonset to become available."
register: "daemonset_state"
command: "{{ bin_dir }}/kubectl get daemonset glusterfs --output=json --ignore-not-found=true"
changed_when: false
vars:
daemonset_state: { stdout: "{}" }
ready: "{{ daemonset_state.stdout | from_json | json_query(\"status.numberReady\") }}"
desired: "{{ daemonset_state.stdout | from_json | json_query(\"status.desiredNumberScheduled\") }}"
until: "ready | int >= 3"
retries: 60
delay: 5
- name: "Kubernetes Apps | Lay Down Heketi Service Account"
template:
src: "heketi-service-account.json.j2"
dest: "{{ kube_config_dir }}/heketi-service-account.json"
mode: "0644"
become: true
register: "rendering"
- name: "Kubernetes Apps | Install and configure Heketi Service Account"
kube:
name: "GlusterFS"
kubectl: "{{ bin_dir }}/kubectl"
filename: "{{ kube_config_dir }}/heketi-service-account.json"
state: "{{ rendering.changed | ternary('latest', 'present') }}"

View File

@@ -1,19 +0,0 @@
---
- name: Get storage nodes
register: "label_present"
command: "{{ bin_dir }}/kubectl get node --selector=storagenode=glusterfs,kubernetes.io/hostname={{ node }} --ignore-not-found=true"
changed_when: false
- name: "Assign storage label"
when: "label_present.stdout_lines | length == 0"
command: "{{ bin_dir }}/kubectl label node {{ node }} storagenode=glusterfs"
- name: Get storage nodes again
register: "label_present"
command: "{{ bin_dir }}/kubectl get node --selector=storagenode=glusterfs,kubernetes.io/hostname={{ node }} --ignore-not-found=true"
changed_when: false
- name: Ensure the label has been set
assert:
that: "label_present | length > 0"
msg: "Node {{ node }} has not been assigned with label storagenode=glusterfs."

View File

@@ -1,34 +0,0 @@
---
- name: "Kubernetes Apps | Lay Down Heketi"
become: true
template:
src: "heketi-deployment.json.j2"
dest: "{{ kube_config_dir }}/heketi-deployment.json"
mode: "0644"
register: "rendering"
- name: "Kubernetes Apps | Install and configure Heketi"
kube:
name: "GlusterFS"
kubectl: "{{ bin_dir }}/kubectl"
filename: "{{ kube_config_dir }}/heketi-deployment.json"
state: "{{ rendering.changed | ternary('latest', 'present') }}"
- name: "Ensure heketi is up and running."
changed_when: false
register: "heketi_state"
vars:
heketi_state:
stdout: "{}"
pods_query: "items[?kind=='Pod'].status.conditions|[0][?type=='Ready'].status|[0]"
deployments_query: "items[?kind=='Deployment'].status.conditions|[0][?type=='Available'].status|[0]"
command: "{{ bin_dir }}/kubectl get deployments,pods --selector=glusterfs --output=json"
until:
- "heketi_state.stdout | from_json | json_query(pods_query) == 'True'"
- "heketi_state.stdout | from_json | json_query(deployments_query) == 'True'"
retries: 60
delay: 5
- name: Set the Heketi pod name
set_fact:
heketi_pod_name: "{{ heketi_state.stdout | from_json | json_query(\"items[?kind=='Pod'].metadata.name|[0]\") }}"

View File

@@ -1,30 +0,0 @@
---
- name: "Kubernetes Apps | GlusterFS"
include_tasks: "glusterfs.yml"
- name: "Kubernetes Apps | Heketi Secrets"
include_tasks: "secret.yml"
- name: "Kubernetes Apps | Test Heketi"
register: "heketi_service_state"
command: "{{ bin_dir }}/kubectl get service heketi-storage-endpoints -o=name --ignore-not-found=true"
changed_when: false
- name: "Kubernetes Apps | Bootstrap Heketi"
when: "heketi_service_state.stdout == \"\""
include_tasks: "bootstrap.yml"
- name: "Kubernetes Apps | Heketi"
include_tasks: "heketi.yml"
- name: "Kubernetes Apps | Heketi Topology"
include_tasks: "topology.yml"
- name: "Kubernetes Apps | Heketi Storage"
include_tasks: "storage.yml"
- name: "Kubernetes Apps | Storage Class"
include_tasks: "storageclass.yml"
- name: "Clean up"
include_tasks: "cleanup.yml"

View File

@@ -1,45 +0,0 @@
---
- name: Get clusterrolebindings
register: "clusterrolebinding_state"
command: "{{ bin_dir }}/kubectl get clusterrolebinding heketi-gluster-admin -o=name --ignore-not-found=true"
changed_when: false
- name: "Kubernetes Apps | Deploy cluster role binding."
when: "clusterrolebinding_state.stdout | length == 0"
command: "{{ bin_dir }}/kubectl create clusterrolebinding heketi-gluster-admin --clusterrole=edit --serviceaccount=default:heketi-service-account"
- name: Get clusterrolebindings again
register: "clusterrolebinding_state"
command: "{{ bin_dir }}/kubectl get clusterrolebinding heketi-gluster-admin -o=name --ignore-not-found=true"
changed_when: false
- name: Make sure that clusterrolebindings are present now
assert:
that: "clusterrolebinding_state.stdout | length > 0"
msg: "Cluster role binding is not present."
- name: Get the heketi-config-secret secret
register: "secret_state"
command: "{{ bin_dir }}/kubectl get secret heketi-config-secret -o=name --ignore-not-found=true"
changed_when: false
- name: "Render Heketi secret configuration."
become: true
template:
src: "heketi.json.j2"
dest: "{{ kube_config_dir }}/heketi.json"
mode: "0644"
- name: "Deploy Heketi config secret"
when: "secret_state.stdout | length == 0"
command: "{{ bin_dir }}/kubectl create secret generic heketi-config-secret --from-file={{ kube_config_dir }}/heketi.json"
- name: Get the heketi-config-secret secret again
register: "secret_state"
command: "{{ bin_dir }}/kubectl get secret heketi-config-secret -o=name --ignore-not-found=true"
changed_when: false
- name: Make sure the heketi-config-secret secret exists now
assert:
that: "secret_state.stdout | length > 0"
msg: "Heketi config secret is not present."

View File

@@ -1,15 +0,0 @@
---
- name: "Kubernetes Apps | Lay Down Heketi Storage"
become: true
vars: { nodes: "{{ groups['heketi-node'] }}" }
template:
src: "heketi-storage.json.j2"
dest: "{{ kube_config_dir }}/heketi-storage.json"
mode: "0644"
register: "rendering"
- name: "Kubernetes Apps | Install and configure Heketi Storage"
kube:
name: "GlusterFS"
kubectl: "{{ bin_dir }}/kubectl"
filename: "{{ kube_config_dir }}/heketi-storage.json"
state: "{{ rendering.changed | ternary('latest', 'present') }}"

View File

@@ -1,26 +0,0 @@
---
- name: "Test storage class."
command: "{{ bin_dir }}/kubectl get storageclass gluster --ignore-not-found=true --output=json"
register: "storageclass"
changed_when: false
- name: "Test heketi service."
command: "{{ bin_dir }}/kubectl get service heketi --ignore-not-found=true --output=json"
register: "heketi_service"
changed_when: false
- name: "Ensure heketi service is available."
assert: { that: "heketi_service.stdout != \"\"" }
- name: "Render storage class configuration."
become: true
vars:
endpoint_address: "{{ (heketi_service.stdout | from_json).spec.clusterIP }}"
template:
src: "storageclass.yml.j2"
dest: "{{ kube_config_dir }}/storageclass.yml"
mode: "0644"
register: "rendering"
- name: "Kubernetes Apps | Install and configure Storace Class"
kube:
name: "GlusterFS"
kubectl: "{{ bin_dir }}/kubectl"
filename: "{{ kube_config_dir }}/storageclass.yml"
state: "{{ rendering.changed | ternary('latest', 'present') }}"

View File

@@ -1,26 +0,0 @@
---
- name: "Get heketi topology."
register: "heketi_topology"
changed_when: false
command: "{{ bin_dir }}/kubectl exec {{ heketi_pod_name }} -- heketi-cli --user admin --secret {{ heketi_admin_key }} topology info --json"
- name: "Render heketi topology template."
become: true
vars: { nodes: "{{ groups['heketi-node'] }}" }
register: "rendering"
template:
src: "topology.json.j2"
dest: "{{ kube_config_dir }}/topology.json"
mode: "0644"
- name: "Copy topology configuration into container." # noqa no-handler
when: "rendering.changed"
command: "{{ bin_dir }}/kubectl cp {{ kube_config_dir }}/topology.json {{ heketi_pod_name }}:/tmp/topology.json"
- name: "Load heketi topology." # noqa no-handler
when: "rendering.changed"
command: "{{ bin_dir }}/kubectl exec {{ heketi_pod_name }} -- heketi-cli --user admin --secret {{ heketi_admin_key }} topology load --json=/tmp/topology.json"
- name: "Get heketi topology."
register: "heketi_topology"
changed_when: false
command: "{{ bin_dir }}/kubectl exec {{ heketi_pod_name }} -- heketi-cli --user admin --secret {{ heketi_admin_key }} topology info --json"
until: "heketi_topology.stdout | from_json | json_query(\"clusters[*].nodes[*].devices[?state=='online'].id\") | flatten | length == groups['heketi-node'] | length"
retries: 60
delay: 5

View File

@@ -1,149 +0,0 @@
{
"kind": "DaemonSet",
"apiVersion": "apps/v1",
"metadata": {
"name": "glusterfs",
"labels": {
"glusterfs": "deployment"
},
"annotations": {
"description": "GlusterFS Daemon Set",
"tags": "glusterfs"
}
},
"spec": {
"selector": {
"matchLabels": {
"glusterfs-node": "daemonset"
}
},
"template": {
"metadata": {
"name": "glusterfs",
"labels": {
"glusterfs-node": "daemonset"
}
},
"spec": {
"nodeSelector": {
"storagenode" : "glusterfs"
},
"hostNetwork": true,
"containers": [
{
"image": "gluster/gluster-centos:gluster4u0_centos7",
"imagePullPolicy": "IfNotPresent",
"name": "glusterfs",
"volumeMounts": [
{
"name": "glusterfs-heketi",
"mountPath": "/var/lib/heketi"
},
{
"name": "glusterfs-run",
"mountPath": "/run"
},
{
"name": "glusterfs-lvm",
"mountPath": "/run/lvm"
},
{
"name": "glusterfs-etc",
"mountPath": "/etc/glusterfs"
},
{
"name": "glusterfs-logs",
"mountPath": "/var/log/glusterfs"
},
{
"name": "glusterfs-config",
"mountPath": "/var/lib/glusterd"
},
{
"name": "glusterfs-dev",
"mountPath": "/dev"
},
{
"name": "glusterfs-cgroup",
"mountPath": "/sys/fs/cgroup"
}
],
"securityContext": {
"capabilities": {},
"privileged": true
},
"readinessProbe": {
"timeoutSeconds": {{ glusterfs_daemonset.readiness_probe.timeout_seconds }},
"initialDelaySeconds": {{ glusterfs_daemonset.readiness_probe.initial_delay_seconds }},
"exec": {
"command": [
"/bin/bash",
"-c",
"systemctl status glusterd.service"
]
}
},
"livenessProbe": {
"timeoutSeconds": {{ glusterfs_daemonset.liveness_probe.timeout_seconds }},
"initialDelaySeconds": {{ glusterfs_daemonset.liveness_probe.initial_delay_seconds }},
"exec": {
"command": [
"/bin/bash",
"-c",
"systemctl status glusterd.service"
]
}
}
}
],
"volumes": [
{
"name": "glusterfs-heketi",
"hostPath": {
"path": "/var/lib/heketi"
}
},
{
"name": "glusterfs-run"
},
{
"name": "glusterfs-lvm",
"hostPath": {
"path": "/run/lvm"
}
},
{
"name": "glusterfs-etc",
"hostPath": {
"path": "/etc/glusterfs"
}
},
{
"name": "glusterfs-logs",
"hostPath": {
"path": "/var/log/glusterfs"
}
},
{
"name": "glusterfs-config",
"hostPath": {
"path": "/var/lib/glusterd"
}
},
{
"name": "glusterfs-dev",
"hostPath": {
"path": "/dev"
}
},
{
"name": "glusterfs-cgroup",
"hostPath": {
"path": "/sys/fs/cgroup"
}
}
]
}
}
}
}

View File

@@ -1,138 +0,0 @@
{
"kind": "List",
"apiVersion": "v1",
"items": [
{
"kind": "Service",
"apiVersion": "v1",
"metadata": {
"name": "deploy-heketi",
"labels": {
"glusterfs": "heketi-service",
"deploy-heketi": "support"
},
"annotations": {
"description": "Exposes Heketi Service"
}
},
"spec": {
"selector": {
"name": "deploy-heketi"
},
"ports": [
{
"name": "deploy-heketi",
"port": 8080,
"targetPort": 8080
}
]
}
},
{
"kind": "Deployment",
"apiVersion": "apps/v1",
"metadata": {
"name": "deploy-heketi",
"labels": {
"glusterfs": "heketi-deployment",
"deploy-heketi": "deployment"
},
"annotations": {
"description": "Defines how to deploy Heketi"
}
},
"spec": {
"selector": {
"matchLabels": {
"name": "deploy-heketi"
}
},
"replicas": 1,
"template": {
"metadata": {
"name": "deploy-heketi",
"labels": {
"name": "deploy-heketi",
"glusterfs": "heketi-pod",
"deploy-heketi": "pod"
}
},
"spec": {
"serviceAccountName": "heketi-service-account",
"containers": [
{
"image": "heketi/heketi:9",
"imagePullPolicy": "Always",
"name": "deploy-heketi",
"env": [
{
"name": "HEKETI_EXECUTOR",
"value": "kubernetes"
},
{
"name": "HEKETI_DB_PATH",
"value": "/var/lib/heketi/heketi.db"
},
{
"name": "HEKETI_FSTAB",
"value": "/var/lib/heketi/fstab"
},
{
"name": "HEKETI_SNAPSHOT_LIMIT",
"value": "14"
},
{
"name": "HEKETI_KUBE_GLUSTER_DAEMONSET",
"value": "y"
}
],
"ports": [
{
"containerPort": 8080
}
],
"volumeMounts": [
{
"name": "db",
"mountPath": "/var/lib/heketi"
},
{
"name": "config",
"mountPath": "/etc/heketi"
}
],
"readinessProbe": {
"timeoutSeconds": 3,
"initialDelaySeconds": 3,
"httpGet": {
"path": "/hello",
"port": 8080
}
},
"livenessProbe": {
"timeoutSeconds": 3,
"initialDelaySeconds": 10,
"httpGet": {
"path": "/hello",
"port": 8080
}
}
}
],
"volumes": [
{
"name": "db"
},
{
"name": "config",
"secret": {
"secretName": "heketi-config-secret"
}
}
]
}
}
}
}
]
}

View File

@@ -1,164 +0,0 @@
{
"kind": "List",
"apiVersion": "v1",
"items": [
{
"kind": "Secret",
"apiVersion": "v1",
"metadata": {
"name": "heketi-db-backup",
"labels": {
"glusterfs": "heketi-db",
"heketi": "db"
}
},
"data": {
},
"type": "Opaque"
},
{
"kind": "Service",
"apiVersion": "v1",
"metadata": {
"name": "heketi",
"labels": {
"glusterfs": "heketi-service",
"deploy-heketi": "support"
},
"annotations": {
"description": "Exposes Heketi Service"
}
},
"spec": {
"selector": {
"name": "heketi"
},
"ports": [
{
"name": "heketi",
"port": 8080,
"targetPort": 8080
}
]
}
},
{
"kind": "Deployment",
"apiVersion": "apps/v1",
"metadata": {
"name": "heketi",
"labels": {
"glusterfs": "heketi-deployment"
},
"annotations": {
"description": "Defines how to deploy Heketi"
}
},
"spec": {
"selector": {
"matchLabels": {
"name": "heketi"
}
},
"replicas": 1,
"template": {
"metadata": {
"name": "heketi",
"labels": {
"name": "heketi",
"glusterfs": "heketi-pod"
}
},
"spec": {
"serviceAccountName": "heketi-service-account",
"containers": [
{
"image": "heketi/heketi:9",
"imagePullPolicy": "Always",
"name": "heketi",
"env": [
{
"name": "HEKETI_EXECUTOR",
"value": "kubernetes"
},
{
"name": "HEKETI_DB_PATH",
"value": "/var/lib/heketi/heketi.db"
},
{
"name": "HEKETI_FSTAB",
"value": "/var/lib/heketi/fstab"
},
{
"name": "HEKETI_SNAPSHOT_LIMIT",
"value": "14"
},
{
"name": "HEKETI_KUBE_GLUSTER_DAEMONSET",
"value": "y"
}
],
"ports": [
{
"containerPort": 8080
}
],
"volumeMounts": [
{
"mountPath": "/backupdb",
"name": "heketi-db-secret"
},
{
"name": "db",
"mountPath": "/var/lib/heketi"
},
{
"name": "config",
"mountPath": "/etc/heketi"
}
],
"readinessProbe": {
"timeoutSeconds": 3,
"initialDelaySeconds": 3,
"httpGet": {
"path": "/hello",
"port": 8080
}
},
"livenessProbe": {
"timeoutSeconds": 3,
"initialDelaySeconds": 10,
"httpGet": {
"path": "/hello",
"port": 8080
}
}
}
],
"volumes": [
{
"name": "db",
"glusterfs": {
"endpoints": "heketi-storage-endpoints",
"path": "heketidbstorage"
}
},
{
"name": "heketi-db-secret",
"secret": {
"secretName": "heketi-db-backup"
}
},
{
"name": "config",
"secret": {
"secretName": "heketi-config-secret"
}
}
]
}
}
}
}
]
}

View File

@@ -1,7 +0,0 @@
{
"apiVersion": "v1",
"kind": "ServiceAccount",
"metadata": {
"name": "heketi-service-account"
}
}

View File

@@ -1,54 +0,0 @@
{
"apiVersion": "v1",
"kind": "List",
"items": [
{
"kind": "Endpoints",
"apiVersion": "v1",
"metadata": {
"name": "heketi-storage-endpoints",
"creationTimestamp": null
},
"subsets": [
{% set nodeblocks = [] %}
{% for node in nodes %}
{% set nodeblock %}
{
"addresses": [
{
"ip": "{{ hostvars[node].ip }}"
}
],
"ports": [
{
"port": 1
}
]
}
{% endset %}
{% if nodeblocks.append(nodeblock) %}{% endif %}
{% endfor %}
{{ nodeblocks|join(',') }}
]
},
{
"kind": "Service",
"apiVersion": "v1",
"metadata": {
"name": "heketi-storage-endpoints",
"creationTimestamp": null
},
"spec": {
"ports": [
{
"port": 1,
"targetPort": 0
}
]
},
"status": {
"loadBalancer": {}
}
}
]
}

View File

@@ -1,44 +0,0 @@
{
"_port_comment": "Heketi Server Port Number",
"port": "8080",
"_use_auth": "Enable JWT authorization. Please enable for deployment",
"use_auth": true,
"_jwt": "Private keys for access",
"jwt": {
"_admin": "Admin has access to all APIs",
"admin": {
"key": "{{ heketi_admin_key }}"
},
"_user": "User only has access to /volumes endpoint",
"user": {
"key": "{{ heketi_user_key }}"
}
},
"_glusterfs_comment": "GlusterFS Configuration",
"glusterfs": {
"_executor_comment": "Execute plugin. Possible choices: mock, kubernetes, ssh",
"executor": "kubernetes",
"_db_comment": "Database file name",
"db": "/var/lib/heketi/heketi.db",
"kubeexec": {
"rebalance_on_expansion": true
},
"sshexec": {
"rebalance_on_expansion": true,
"keyfile": "/etc/heketi/private_key",
"fstab": "/etc/fstab",
"port": "22",
"user": "root",
"sudo": false
}
},
"_backup_db_to_kube_secret": "Backup the heketi database to a Kubernetes secret when running in Kubernetes. Default is off.",
"backup_db_to_kube_secret": false
}

View File

@@ -1,12 +0,0 @@
---
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
name: gluster
annotations:
storageclass.beta.kubernetes.io/is-default-class: "true"
provisioner: kubernetes.io/glusterfs
parameters:
resturl: "http://{{ endpoint_address }}:8080"
restuser: "admin"
restuserkey: "{{ heketi_admin_key }}"

View File

@@ -1,34 +0,0 @@
{
"clusters": [
{
"nodes": [
{% set nodeblocks = [] %}
{% for node in nodes %}
{% set nodeblock %}
{
"node": {
"hostnames": {
"manage": [
"{{ node }}"
],
"storage": [
"{{ hostvars[node].ip }}"
]
},
"zone": 1
},
"devices": [
{
"name": "{{ hostvars[node]['disk_volume_device_1'] }}",
"destroydata": false
}
]
}
{% endset %}
{% if nodeblocks.append(nodeblock) %}{% endif %}
{% endfor %}
{{ nodeblocks|join(',') }}
]
}
]
}

View File

@@ -1,2 +0,0 @@
---
heketi_remove_lvm: false

View File

@@ -1,52 +0,0 @@
---
- name: "Install lvm utils (RedHat)"
become: true
package:
name: "lvm2"
state: "present"
when: "ansible_os_family == 'RedHat'"
- name: "Install lvm utils (Debian)"
become: true
apt:
name: "lvm2"
state: "present"
when: "ansible_os_family == 'Debian'"
- name: "Get volume group information."
environment:
PATH: "{{ ansible_env.PATH }}:/sbin" # Make sure we can workaround RH / CentOS conservative path management
become: true
shell: "pvs {{ disk_volume_device_1 }} --option vg_name | tail -n+2"
register: "volume_groups"
ignore_errors: true # noqa ignore-errors
changed_when: false
- name: "Remove volume groups."
environment:
PATH: "{{ ansible_env.PATH }}:/sbin" # Make sure we can workaround RH / CentOS conservative path management
become: true
command: "vgremove {{ volume_group }} --yes"
with_items: "{{ volume_groups.stdout_lines }}"
loop_control: { loop_var: "volume_group" }
- name: "Remove physical volume from cluster disks."
environment:
PATH: "{{ ansible_env.PATH }}:/sbin" # Make sure we can workaround RH / CentOS conservative path management
become: true
command: "pvremove {{ disk_volume_device_1 }} --yes"
ignore_errors: true # noqa ignore-errors
- name: "Remove lvm utils (RedHat)"
become: true
package:
name: "lvm2"
state: "absent"
when: "ansible_os_family == 'RedHat' and heketi_remove_lvm"
- name: "Remove lvm utils (Debian)"
become: true
apt:
name: "lvm2"
state: "absent"
when: "ansible_os_family == 'Debian' and heketi_remove_lvm"

View File

@@ -1,51 +0,0 @@
---
- name: Remove storage class.
command: "{{ bin_dir }}/kubectl delete storageclass gluster"
ignore_errors: true # noqa ignore-errors
- name: Tear down heketi.
command: "{{ bin_dir }}/kubectl delete all,service,jobs,deployment,secret --selector=\"glusterfs=heketi-pod\""
ignore_errors: true # noqa ignore-errors
- name: Tear down heketi.
command: "{{ bin_dir }}/kubectl delete all,service,jobs,deployment,secret --selector=\"glusterfs=heketi-deployment\""
ignore_errors: true # noqa ignore-errors
- name: Tear down bootstrap.
include_tasks: "../../provision/tasks/bootstrap/tear-down.yml"
- name: Ensure there is nothing left over.
command: "{{ bin_dir }}/kubectl get all,service,jobs,deployment,secret --selector=\"glusterfs=heketi-pod\" -o=json"
register: "heketi_result"
until: "heketi_result.stdout | from_json | json_query('items[*]') | length == 0"
retries: 60
delay: 5
- name: Ensure there is nothing left over.
command: "{{ bin_dir }}/kubectl get all,service,jobs,deployment,secret --selector=\"glusterfs=heketi-deployment\" -o=json"
register: "heketi_result"
until: "heketi_result.stdout | from_json | json_query('items[*]') | length == 0"
retries: 60
delay: 5
- name: Tear down glusterfs.
command: "{{ bin_dir }}/kubectl delete daemonset.extensions/glusterfs"
ignore_errors: true # noqa ignore-errors
- name: Remove heketi storage service.
command: "{{ bin_dir }}/kubectl delete service heketi-storage-endpoints"
ignore_errors: true # noqa ignore-errors
- name: Remove heketi gluster role binding
command: "{{ bin_dir }}/kubectl delete clusterrolebinding heketi-gluster-admin"
ignore_errors: true # noqa ignore-errors
- name: Remove heketi config secret
command: "{{ bin_dir }}/kubectl delete secret heketi-config-secret"
ignore_errors: true # noqa ignore-errors
- name: Remove heketi db backup
command: "{{ bin_dir }}/kubectl delete secret heketi-db-backup"
ignore_errors: true # noqa ignore-errors
- name: Remove heketi service account
command: "{{ bin_dir }}/kubectl delete serviceaccount heketi-service-account"
ignore_errors: true # noqa ignore-errors
- name: Get secrets
command: "{{ bin_dir }}/kubectl get secrets --output=\"json\""
register: "secrets"
changed_when: false
- name: Remove heketi storage secret
vars: { storage_query: "items[?metadata.annotations.\"kubernetes.io/service-account.name\"=='heketi-service-account'].metadata.name|[0]" }
command: "{{ bin_dir }}/kubectl delete secret {{ secrets.stdout | from_json | json_query(storage_query) }}"
when: "storage_query is defined"
ignore_errors: true # noqa ignore-errors

View File

@@ -31,7 +31,7 @@ manage-offline-container-images.sh register
## generate_list.sh
This script generates the list of downloaded files and the list of container images by `roles/kubespray-defaults/defaults/main/download.yml` file.
This script generates the list of downloaded files and the list of container images by `roles/kubespray_defaults/defaults/main/download.yml` file.
Run this script will execute `generate_list.yml` playbook in kubespray root directory and generate four files,
all downloaded files url in files.list, all container images in images.list, jinja2 templates in *.template.

View File

@@ -5,7 +5,7 @@ CURRENT_DIR=$(cd $(dirname $0); pwd)
TEMP_DIR="${CURRENT_DIR}/temp"
REPO_ROOT_DIR="${CURRENT_DIR%/contrib/offline}"
: ${DOWNLOAD_YML:="roles/kubespray-defaults/defaults/main/download.yml"}
: ${DOWNLOAD_YML:="roles/kubespray_defaults/defaults/main/download.yml"}
mkdir -p ${TEMP_DIR}
@@ -19,12 +19,12 @@ sed -n '/^downloads:/,/download_defaults:/p' ${REPO_ROOT_DIR}/${DOWNLOAD_YML} \
| sed 'N;s#\n# #g' | tr ' ' ':' | sed 's/\"//g' > ${TEMP_DIR}/images.list.template
# add kube-* images to images list template
# Those container images are downloaded by kubeadm, then roles/kubespray-defaults/defaults/main/download.yml
# Those container images are downloaded by kubeadm, then roles/kubespray_defaults/defaults/main/download.yml
# doesn't contain those images. That is reason why here needs to put those images into the
# list separately.
KUBE_IMAGES="kube-apiserver kube-controller-manager kube-scheduler kube-proxy"
for i in $KUBE_IMAGES; do
echo "{{ kube_image_repo }}/$i:{{ kube_version }}" >> ${TEMP_DIR}/images.list.template
echo "{{ kube_image_repo }}/$i:v{{ kube_version }}" >> ${TEMP_DIR}/images.list.template
done
# run ansible to expand templates

View File

@@ -5,7 +5,7 @@
roles:
# Just load default variables from roles.
- role: kubespray-defaults
- role: kubespray_defaults
when: false
- role: download
when: false

View File

@@ -127,7 +127,7 @@ function register_container_images() {
tar -zxvf ${IMAGE_TAR_FILE}
if [ "${create_registry}" ]; then
if ${create_registry}; then
sudo ${runtime} load -i ${IMAGE_DIR}/registry-latest.tar
set +e
@@ -148,7 +148,7 @@ function register_container_images() {
if [ "${org_image}" == "ID:" ]; then
org_image=$(echo "${load_image}" | awk '{print $4}')
fi
image_id=$(sudo ${runtime} image inspect ${org_image} | grep "\"Id\":" | awk -F: '{print $3}'| sed s/'\",'//)
image_id=$(sudo ${runtime} image inspect --format "{{.Id}}" "${org_image}")
if [ -z "${file_name}" ]; then
echo "Failed to get file_name for line ${line}"
exit 1

View File

@@ -1,5 +1,11 @@
terraform {
required_version = ">= 0.12.0"
required_providers {
aws = {
source = "hashicorp/aws"
version = "~> 5.0"
}
}
}
provider "aws" {

View File

@@ -1,246 +0,0 @@
# Kubernetes on Equinix Metal with Terraform
Provision a Kubernetes cluster with [Terraform](https://www.terraform.io) on
[Equinix Metal](https://metal.equinix.com) ([formerly Packet](https://blog.equinix.com/blog/2020/10/06/equinix-metal-metal-and-more/)).
## Status
This will install a Kubernetes cluster on Equinix Metal. It should work in all locations and on most server types.
## Approach
The terraform configuration inspects variables found in
[variables.tf](variables.tf) to create resources in your Equinix Metal project.
There is a [python script](../terraform.py) that reads the generated`.tfstate`
file to generate a dynamic inventory that is consumed by [cluster.yml](../../../cluster.yml)
to actually install Kubernetes with Kubespray.
### Kubernetes Nodes
You can create many different kubernetes topologies by setting the number of
different classes of hosts.
- Master nodes with etcd: `number_of_k8s_masters` variable
- Master nodes without etcd: `number_of_k8s_masters_no_etcd` variable
- Standalone etcd hosts: `number_of_etcd` variable
- Kubernetes worker nodes: `number_of_k8s_nodes` variable
Note that the Ansible script will report an invalid configuration if you wind up
with an *even number* of etcd instances since that is not a valid configuration. This
restriction includes standalone etcd nodes that are deployed in a cluster along with
master nodes with etcd replicas. As an example, if you have three master nodes with
etcd replicas and three standalone etcd nodes, the script will fail since there are
now six total etcd replicas.
## Requirements
- [Install Terraform](https://www.terraform.io/intro/getting-started/install.html)
- [Install Ansible dependencies](/docs/ansible/ansible.md#installing-ansible)
- Account with Equinix Metal
- An SSH key pair
## SSH Key Setup
An SSH keypair is required so Ansible can access the newly provisioned nodes (Equinix Metal hosts). By default, the public SSH key defined in cluster.tfvars will be installed in authorized_key on the newly provisioned nodes (~/.ssh/id_rsa.pub). Terraform will upload this public key and then it will be distributed out to all the nodes. If you have already set this public key in Equinix Metal (i.e. via the portal), then set the public keyfile name in cluster.tfvars to blank to prevent the duplicate key from being uploaded which will cause an error.
If you don't already have a keypair generated (~/.ssh/id_rsa and ~/.ssh/id_rsa.pub), then a new keypair can be generated with the command:
```ShellSession
ssh-keygen -f ~/.ssh/id_rsa
```
## Terraform
Terraform will be used to provision all of the Equinix Metal resources with base software as appropriate.
### Configuration
#### Inventory files
Create an inventory directory for your cluster by copying the existing sample and linking the `hosts` script (used to build the inventory based on Terraform state):
```ShellSession
cp -LRp contrib/terraform/equinix/sample-inventory inventory/$CLUSTER
cd inventory/$CLUSTER
ln -s ../../contrib/terraform/equinix/hosts
```
This will be the base for subsequent Terraform commands.
#### Equinix Metal API access
Your Equinix Metal API key must be available in the `METAL_AUTH_TOKEN` environment variable.
This key is typically stored outside of the code repo since it is considered secret.
If someone gets this key, they can startup/shutdown hosts in your project!
For more information on how to generate an API key or find your project ID, please see
[Accounts Index](https://metal.equinix.com/developers/docs/accounts/).
The Equinix Metal Project ID associated with the key will be set later in `cluster.tfvars`.
For more information about the API, please see [Equinix Metal API](https://metal.equinix.com/developers/api/).
For more information about terraform provider authentication, please see [the equinix provider documentation](https://registry.terraform.io/providers/equinix/equinix/latest/docs).
Example:
```ShellSession
export METAL_AUTH_TOKEN="Example-API-Token"
```
Note that to deploy several clusters within the same project you need to use [terraform workspace](https://www.terraform.io/docs/state/workspaces.html#using-workspaces).
#### Cluster variables
The construction of the cluster is driven by values found in
[variables.tf](variables.tf).
For your cluster, edit `inventory/$CLUSTER/cluster.tfvars`.
The `cluster_name` is used to set a tag on each server deployed as part of this cluster.
This helps when identifying which hosts are associated with each cluster.
While the defaults in variables.tf will successfully deploy a cluster, it is recommended to set the following values:
- cluster_name = the name of the inventory directory created above as $CLUSTER
- equinix_metal_project_id = the Equinix Metal Project ID associated with the Equinix Metal API token above
#### Enable localhost access
Kubespray will pull down a Kubernetes configuration file to access this cluster by enabling the
`kubeconfig_localhost: true` in the Kubespray configuration.
Edit `inventory/$CLUSTER/group_vars/k8s_cluster/k8s_cluster.yml` and comment back in the following line and change from `false` to `true`:
`\# kubeconfig_localhost: false`
becomes:
`kubeconfig_localhost: true`
Once the Kubespray playbooks are run, a Kubernetes configuration file will be written to the local host at `inventory/$CLUSTER/artifacts/admin.conf`
#### Terraform state files
In the cluster's inventory folder, the following files might be created (either by Terraform
or manually), to prevent you from pushing them accidentally they are in a
`.gitignore` file in the `contrib/terraform/equinix` directory :
- `.terraform`
- `.tfvars`
- `.tfstate`
- `.tfstate.backup`
- `.lock.hcl`
You can still add them manually if you want to.
### Initialization
Before Terraform can operate on your cluster you need to install the required
plugins. This is accomplished as follows:
```ShellSession
cd inventory/$CLUSTER
terraform -chdir=../../contrib/terraform/metal init -var-file=cluster.tfvars
```
This should finish fairly quickly telling you Terraform has successfully initialized and loaded necessary modules.
### Provisioning cluster
You can apply the Terraform configuration to your cluster with the following command
issued from your cluster's inventory directory (`inventory/$CLUSTER`):
```ShellSession
terraform -chdir=../../contrib/terraform/equinix apply -var-file=cluster.tfvars
export ANSIBLE_HOST_KEY_CHECKING=False
ansible-playbook -i hosts ../../cluster.yml
```
### Destroying cluster
You can destroy your new cluster with the following command issued from the cluster's inventory directory:
```ShellSession
terraform -chdir=../../contrib/terraform/equinix destroy -var-file=cluster.tfvars
```
If you've started the Ansible run, it may also be a good idea to do some manual cleanup:
- Remove SSH keys from the destroyed cluster from your `~/.ssh/known_hosts` file
- Clean up any temporary cache files: `rm /tmp/$CLUSTER-*`
### Debugging
You can enable debugging output from Terraform by setting `TF_LOG` to `DEBUG` before running the Terraform command.
## Ansible
### Node access
#### SSH
Ensure your local ssh-agent is running and your ssh key has been added. This
step is required by the terraform provisioner:
```ShellSession
eval $(ssh-agent -s)
ssh-add ~/.ssh/id_rsa
```
If you have deployed and destroyed a previous iteration of your cluster, you will need to clear out any stale keys from your SSH "known hosts" file ( `~/.ssh/known_hosts`).
#### Test access
Make sure you can connect to the hosts. Note that Flatcar Container Linux by Kinvolk will have a state `FAILED` due to Python not being present. This is okay, because Python will be installed during bootstrapping, so long as the hosts are not `UNREACHABLE`.
```ShellSession
$ ansible -i inventory/$CLUSTER/hosts -m ping all
example-k8s_node-1 | SUCCESS => {
"changed": false,
"ping": "pong"
}
example-etcd-1 | SUCCESS => {
"changed": false,
"ping": "pong"
}
example-k8s-master-1 | SUCCESS => {
"changed": false,
"ping": "pong"
}
```
If it fails try to connect manually via SSH. It could be something as simple as a stale host key.
### Deploy Kubernetes
```ShellSession
ansible-playbook --become -i inventory/$CLUSTER/hosts cluster.yml
```
This will take some time as there are many tasks to run.
## Kubernetes
### Set up kubectl
- [Install kubectl](https://kubernetes.io/docs/tasks/tools/install-kubectl/) on the localhost.
- Verify that Kubectl runs correctly
```ShellSession
kubectl version
```
- Verify that the Kubernetes configuration file has been copied over
```ShellSession
cat inventory/alpha/$CLUSTER/admin.conf
```
- Verify that all the nodes are running correctly.
```ShellSession
kubectl version
kubectl --kubeconfig=inventory/$CLUSTER/artifacts/admin.conf get nodes
```
## What's next
Try out your new Kubernetes cluster with the [Hello Kubernetes service](https://kubernetes.io/docs/tasks/access-application-cluster/service-access-application-cluster/).

View File

@@ -1 +0,0 @@
../terraform.py

View File

@@ -1,57 +0,0 @@
resource "equinix_metal_ssh_key" "k8s" {
count = var.public_key_path != "" ? 1 : 0
name = "kubernetes-${var.cluster_name}"
public_key = chomp(file(var.public_key_path))
}
resource "equinix_metal_device" "k8s_master" {
depends_on = [equinix_metal_ssh_key.k8s]
count = var.number_of_k8s_masters
hostname = "${var.cluster_name}-k8s-master-${count.index + 1}"
plan = var.plan_k8s_masters
metro = var.metro
operating_system = var.operating_system
billing_cycle = var.billing_cycle
project_id = var.equinix_metal_project_id
tags = ["cluster-${var.cluster_name}", "k8s_cluster", "kube_control_plane", "etcd", "kube_node"]
}
resource "equinix_metal_device" "k8s_master_no_etcd" {
depends_on = [equinix_metal_ssh_key.k8s]
count = var.number_of_k8s_masters_no_etcd
hostname = "${var.cluster_name}-k8s-master-${count.index + 1}"
plan = var.plan_k8s_masters_no_etcd
metro = var.metro
operating_system = var.operating_system
billing_cycle = var.billing_cycle
project_id = var.equinix_metal_project_id
tags = ["cluster-${var.cluster_name}", "k8s_cluster", "kube_control_plane"]
}
resource "equinix_metal_device" "k8s_etcd" {
depends_on = [equinix_metal_ssh_key.k8s]
count = var.number_of_etcd
hostname = "${var.cluster_name}-etcd-${count.index + 1}"
plan = var.plan_etcd
metro = var.metro
operating_system = var.operating_system
billing_cycle = var.billing_cycle
project_id = var.equinix_metal_project_id
tags = ["cluster-${var.cluster_name}", "etcd"]
}
resource "equinix_metal_device" "k8s_node" {
depends_on = [equinix_metal_ssh_key.k8s]
count = var.number_of_k8s_nodes
hostname = "${var.cluster_name}-k8s-node-${count.index + 1}"
plan = var.plan_k8s_nodes
metro = var.metro
operating_system = var.operating_system
billing_cycle = var.billing_cycle
project_id = var.equinix_metal_project_id
tags = ["cluster-${var.cluster_name}", "k8s_cluster", "kube_node"]
}

View File

@@ -1,15 +0,0 @@
output "k8s_masters" {
value = equinix_metal_device.k8s_master.*.access_public_ipv4
}
output "k8s_masters_no_etc" {
value = equinix_metal_device.k8s_master_no_etcd.*.access_public_ipv4
}
output "k8s_etcds" {
value = equinix_metal_device.k8s_etcd.*.access_public_ipv4
}
output "k8s_nodes" {
value = equinix_metal_device.k8s_node.*.access_public_ipv4
}

View File

@@ -1,17 +0,0 @@
terraform {
required_version = ">= 1.0.0"
provider_meta "equinix" {
module_name = "kubespray"
}
required_providers {
equinix = {
source = "equinix/equinix"
version = "1.24.0"
}
}
}
# Configure the Equinix Metal Provider
provider "equinix" {
}

View File

@@ -1,35 +0,0 @@
# your Kubernetes cluster name here
cluster_name = "mycluster"
# Your Equinix Metal project ID. See https://metal.equinix.com/developers/docs/accounts/
equinix_metal_project_id = "Example-Project-Id"
# The public SSH key to be uploaded into authorized_keys in bare metal Equinix Metal nodes provisioned
# leave this value blank if the public key is already setup in the Equinix Metal project
# Terraform will complain if the public key is setup in Equinix Metal
public_key_path = "~/.ssh/id_rsa.pub"
# Equinix interconnected bare metal across our global metros.
metro = "da"
# operating_system
operating_system = "ubuntu_22_04"
# standalone etcds
number_of_etcd = 0
plan_etcd = "t1.small.x86"
# masters
number_of_k8s_masters = 1
number_of_k8s_masters_no_etcd = 0
plan_k8s_masters = "t1.small.x86"
plan_k8s_masters_no_etcd = "t1.small.x86"
# nodes
number_of_k8s_nodes = 2
plan_k8s_nodes = "t1.small.x86"

View File

@@ -1 +0,0 @@
../../../../inventory/sample/group_vars

View File

@@ -1,56 +0,0 @@
variable "cluster_name" {
default = "kubespray"
}
variable "equinix_metal_project_id" {
description = "Your Equinix Metal project ID. See https://metal.equinix.com/developers/docs/accounts/"
}
variable "operating_system" {
default = "ubuntu_22_04"
}
variable "public_key_path" {
description = "The path of the ssh pub key"
default = "~/.ssh/id_rsa.pub"
}
variable "billing_cycle" {
default = "hourly"
}
variable "metro" {
default = "da"
}
variable "plan_k8s_masters" {
default = "c3.small.x86"
}
variable "plan_k8s_masters_no_etcd" {
default = "c3.small.x86"
}
variable "plan_etcd" {
default = "c3.small.x86"
}
variable "plan_k8s_nodes" {
default = "c3.medium.x86"
}
variable "number_of_k8s_masters" {
default = 1
}
variable "number_of_k8s_masters_no_etcd" {
default = 0
}
variable "number_of_etcd" {
default = 0
}
variable "number_of_k8s_nodes" {
default = 1
}

View File

@@ -102,7 +102,8 @@ Please read the instructions in both repos on how to install it.
You can teardown your infrastructure using the following Terraform command:
```bash
terraform destroy --var-file default.tfvars ../../contrib/terraform/hetzner
cd ./kubespray
terraform -chdir=./contrib/terraform/hetzner/ destroy --var-file=../../../inventory/$CLUSTER/default.tfvars
```
## Variables

View File

@@ -2,35 +2,6 @@
Provision a Kubernetes cluster on [UpCloud](https://upcloud.com/) using Terraform and Kubespray
## Overview
The setup looks like following
```text
Kubernetes cluster
+--------------------------+
| +--------------+ |
| | +--------------+ |
| --> | | | |
| | | Master/etcd | |
| | | node(s) | |
| +-+ | |
| +--------------+ |
| ^ |
| | |
| v |
| +--------------+ |
| | +--------------+ |
| --> | | | |
| | | Worker | |
| | | node(s) | |
| +-+ | |
| +--------------+ |
+--------------------------+
```
The nodes uses a private network for node to node communication and a public interface for all external communication.
## Requirements
* Terraform 0.13.0 or newer
@@ -100,6 +71,8 @@ terraform destroy --var-file cluster-settings.tfvars \
* `template_name`: The name or UUID of a base image
* `username`: a user to access the nodes, defaults to "ubuntu"
* `private_network_cidr`: CIDR to use for the private network, defaults to "172.16.0.0/24"
* `dns_servers`: DNS servers that will be used by the nodes. Until [this is solved](https://github.com/UpCloudLtd/terraform-provider-upcloud/issues/562) this is done using user_data to reconfigure resolved. Defaults to `[]`
* `use_public_ips`: If a NIC connencted to the Public network should be attached to all nodes by default. Can be overridden by `force_public_ip` if this is set to `false`. Defaults to `true`
* `ssh_public_keys`: List of public SSH keys to install on all machines
* `zone`: The zone where to run the cluster
* `machines`: Machines to provision. Key of this object will be used as the name of the machine
@@ -108,6 +81,8 @@ terraform destroy --var-file cluster-settings.tfvars \
* `cpu`: number of cpu cores
* `mem`: memory size in MB
* `disk_size`: The size of the storage in GB
* `force_public_ip`: If `use_public_ips` is set to `false`, this forces a public NIC onto the machine anyway when set to `true`. Useful if you're migrating from public nodes to only private. Defaults to `false`
* `dns_servers`: This works the same way as the global `dns_severs` but only applies to a single node. If set to `[]` while the global `dns_servers` is set to something else, then it will not add the user_data and thus will not be recreated. Useful if you're migrating from public nodes to only private. Defaults to `null`
* `additional_disks`: Additional disks to attach to the node.
* `size`: The size of the additional disk in GB
* `tier`: The tier of disk to use (`maxiops` is the only one you can choose atm)
@@ -139,6 +114,7 @@ terraform destroy --var-file cluster-settings.tfvars \
* `port`: Port to load balance.
* `target_port`: Port to the backend servers.
* `backend_servers`: List of servers that traffic to the port should be forwarded to.
* `proxy_protocol`: If the loadbalancer should set up the backend using proxy protocol.
* `router_enable`: If a router should be connected to the private network or not
* `gateways`: Gateways that should be connected to the router, requires router_enable is set to true
* `features`: List of features for the gateway
@@ -171,3 +147,27 @@ terraform destroy --var-file cluster-settings.tfvars \
* `server_groups`: Group servers together
* `servers`: The servers that should be included in the group.
* `anti_affinity_policy`: Defines if a server group is an anti-affinity group. Setting this to "strict" or yes" will result in all servers in the group being placed on separate compute hosts. The value can be "strict", "yes" or "no". "strict" refers to strict policy doesn't allow servers in the same server group to be on the same host. "yes" refers to best-effort policy and tries to put servers on different hosts, but this is not guaranteed.
## Migration
When `null_resource.inventories` and `data.template_file.inventory` was changed to `local_file.inventory` the old state file needs to be cleaned of the old state.
The error messages you'll see if you encounter this is:
```text
Error: failed to read schema for null_resource.inventories in registry.terraform.io/hashicorp/null: failed to instantiate provider "registry.terraform.io/hashicorp/null" to obtain schema: unavailable provider "registry.terraform.io/hashicorp/null"
Error: failed to read schema for data.template_file.inventory in registry.terraform.io/hashicorp/template: failed to instantiate provider "registry.terraform.io/hashicorp/template" to obtain schema: unavailable provider "registry.terraform.io/hashicorp/template"
```
This can be fixed with the following lines
```bash
terraform state rm -state=terraform.tfstate null_resource.inventories
terraform state rm -state=terraform.tfstate data.template_file.inventory
```
### Public to Private only migration
Since there's no way to remove the public NIC on a machine without recreating its private NIC it's not possible to inplace change a cluster to only use private IPs.
The way to migrate is to first set `use_public_ips` to `false`, `dns_servers` to some DNS servers and then update all existing servers to have `force_public_ip` set to `true` and `dns_severs` set to `[]`.
After that you can add new nodes without `force_public_ip` and `dns_servers` set and create them.
Add the new nodes into the cluster and when all of them are added, remove the old nodes.

View File

@@ -122,11 +122,11 @@ k8s_allowed_remote_ips = [
master_allowed_ports = []
worker_allowed_ports = []
loadbalancer_enabled = false
loadbalancer_plan = "development"
loadbalancer_proxy_protocol = false
loadbalancer_enabled = false
loadbalancer_plan = "development"
loadbalancers = {
# "http" : {
# "proxy_protocol" : false
# "port" : 80,
# "target_port" : 80,
# "backend_servers" : [

View File

@@ -20,24 +20,26 @@ module "kubernetes" {
username = var.username
private_network_cidr = var.private_network_cidr
dns_servers = var.dns_servers
use_public_ips = var.use_public_ips
machines = var.machines
ssh_public_keys = var.ssh_public_keys
firewall_enabled = var.firewall_enabled
firewall_default_deny_in = var.firewall_default_deny_in
firewall_default_deny_out = var.firewall_default_deny_out
master_allowed_remote_ips = var.master_allowed_remote_ips
k8s_allowed_remote_ips = var.k8s_allowed_remote_ips
master_allowed_ports = var.master_allowed_ports
worker_allowed_ports = var.worker_allowed_ports
firewall_enabled = var.firewall_enabled
firewall_default_deny_in = var.firewall_default_deny_in
firewall_default_deny_out = var.firewall_default_deny_out
master_allowed_remote_ips = var.master_allowed_remote_ips
k8s_allowed_remote_ips = var.k8s_allowed_remote_ips
bastion_allowed_remote_ips = var.bastion_allowed_remote_ips
master_allowed_ports = var.master_allowed_ports
worker_allowed_ports = var.worker_allowed_ports
loadbalancer_enabled = var.loadbalancer_enabled
loadbalancer_plan = var.loadbalancer_plan
loadbalancer_outbound_proxy_protocol = var.loadbalancer_proxy_protocol ? "v2" : ""
loadbalancer_legacy_network = var.loadbalancer_legacy_network
loadbalancers = var.loadbalancers
loadbalancer_enabled = var.loadbalancer_enabled
loadbalancer_plan = var.loadbalancer_plan
loadbalancer_legacy_network = var.loadbalancer_legacy_network
loadbalancers = var.loadbalancers
router_enable = var.router_enable
gateways = var.gateways
@@ -52,32 +54,12 @@ module "kubernetes" {
# Generate ansible inventory
#
data "template_file" "inventory" {
template = file("${path.module}/templates/inventory.tpl")
vars = {
connection_strings_master = join("\n", formatlist("%s ansible_user=ubuntu ansible_host=%s ip=%s etcd_member_name=etcd%d",
keys(module.kubernetes.master_ip),
values(module.kubernetes.master_ip).*.public_ip,
values(module.kubernetes.master_ip).*.private_ip,
range(1, length(module.kubernetes.master_ip) + 1)))
connection_strings_worker = join("\n", formatlist("%s ansible_user=ubuntu ansible_host=%s ip=%s",
keys(module.kubernetes.worker_ip),
values(module.kubernetes.worker_ip).*.public_ip,
values(module.kubernetes.worker_ip).*.private_ip))
list_master = join("\n", formatlist("%s",
keys(module.kubernetes.master_ip)))
list_worker = join("\n", formatlist("%s",
keys(module.kubernetes.worker_ip)))
}
}
resource "null_resource" "inventories" {
provisioner "local-exec" {
command = "echo '${data.template_file.inventory.rendered}' > ${var.inventory_file}"
}
triggers = {
template = data.template_file.inventory.rendered
}
resource "local_file" "inventory" {
content = templatefile("${path.module}/templates/inventory.tpl", {
master_ip = module.kubernetes.master_ip
worker_ip = module.kubernetes.worker_ip
bastion_ip = module.kubernetes.bastion_ip
username = var.username
})
filename = var.inventory_file
}

Some files were not shown because too many files have changed in this diff Show More