* Remove checks for docs using exact tags
Instead use a more generic documentation for installing kubespray as a
collection from git.
* Check that we upgraded galaxy.yml to next version
This is only intented to check for human error. The version in galaxy
should be the next (which does not mean the same if we're on master or a
release branch).
* Set collection version to KUBESPRAY_NEXT_VERSION
The blockSize attribute from Calico IPPool resources cannot be changed
once set [1]. Consequently, we use the one currently defined when
configuring the existing IPPool, avoiding upgrade errors by trying to
change it.
In particular, this can be useful when calico_pool_blocksize default
changes in kubespray, which would otherwise force users to add an
explicit setting to their inventories.
[1]: https://docs.tigera.io/calico/latest/reference/resources/ippool#spec
* [kubernetes] Add hashes for kubernetes 1.26.11, 1.26.10
Make kubernetes 1.26.11 default
* Workaround for yaml/pyyaml#601
* Convert exoscale tf provider to new version (#10646)
This is untested. It passes terraform validate to un-broke the CI.
* Update 0040-verify-settings.yml (#10699)
remove embedded template
* Use supported version of fedora in CI (#10108)
* tests: replace fedora35 with fedora37
Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
* tests: replace fedora36 with fedora38
Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
* docs: update fedora version in docs
Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
* molecule: upgrade fedora version
Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
* tests: upgrade fedora images for vagrant and kubevirt
Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
* vagrant: workaround to fix private network ip address in fedora
Fedora stop supporting syconfig network script so we added a workaround
here
https://github.com/hashicorp/vagrant/issues/12762#issuecomment-1535957837
to fix it.
* netowrkmanager: do not configure dns if using systemd-resolved
We should not configure dns if we point to systemd-resolved.
Systemd-resolved is using NetworkManager to infer the upstream DNS
server so if we set NetworkManager to 127.0.0.53 it will prevent
systemd-resolved to get the correct network DNS server.
Thus if we are in this case we just don't set this setting.
Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
* image-builder: update centos7 image
Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
* gitlab-ci: mark fedora packet jobs as allow failure
Fedora networking is still broken on Packet, let's mark it as allow
failure for now.
Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
---------
Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
---------
Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
Co-authored-by: piwinkler <9642809+piwinkler@users.noreply.github.com>
Co-authored-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
``
"msg": "Failed to template loop_control.label: 'ansible.utils.unsafe_proxy.AnsibleUnsafeText object' has no attribute 'item'. 'ansible.utils.unsafe_proxy.AnsibleUnsafeText object' has no attribute 'item'", "skip_reason": "Conditional result was False"}
``
fixes case when multus should NOT be included.
Calling bootstrap in facts.yaml so that we can always collect facts even on
new nodes. This is useful when you want to add nodes to an inventory
beforehand and then collect facts and scale the cluster with the scale
playbook and --limits. With dynamic inventory sometimes it might be more
difficult to add the nodes after running the facts playbook in this
specific situation.
Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
This feature no longer works on Ansible 6 / ansible-core 2.13. We do not
support these version officially yet but this will help for the future
upgrade and may help some people running those inadvertently.
Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
- Test with new version: 37.20230322.3.0. Both containerd and
cri-o is tested
- bugfix: when we use crio and the var bin_dir is changed,
there will be some error about the new bin dir.
* remove-debian9-support
* Add six module into openstack-cleanup/requirements.txt (#10099)
To fix tf-elastx_cleanup job which was failed with the following error:
File "/usr/local/lib/python3.11/site-packages/keystoneauth1/identity/generic/password.py", line 16, in <module>
from keystoneauth1.identity import v3
File "/usr/local/lib/python3.11/site-packages/keystoneauth1/identity/v3/__init__.py", line 27, in <module>
from keystoneauth1.identity.v3.oauth2_mtls_client_credential import * # noqa
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/keystoneauth1/identity/v3/oauth2_mtls_client_credential.py", line 17, in <module>
import six
ModuleNotFoundError: No module named 'six'
---------
Co-authored-by: Kenichi Omichi <ken1ohmichi@gmail.com>
According to the canal github[1] the repo is not maintained over 5 years.
In addition, the README says
```
Originally, we thought we might more deeply integrate the two projects
(possibly even going as far as a rebranding!). However, over time it
became clear that that wasn't really necessary to fulfil our goal of
making them work well together. Ultimately, we decided to focus on
adding features to both projects rather than doing work just to
combine them.
```
So it is difficult to support canal by Kubespray at this situation.
[1]: https://github.com/projectcalico/canal
To fix tf-elastx_cleanup job which was failed with the following error:
File "/usr/local/lib/python3.11/site-packages/keystoneauth1/identity/generic/password.py", line 16, in <module>
from keystoneauth1.identity import v3
File "/usr/local/lib/python3.11/site-packages/keystoneauth1/identity/v3/__init__.py", line 27, in <module>
from keystoneauth1.identity.v3.oauth2_mtls_client_credential import * # noqa
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/usr/local/lib/python3.11/site-packages/keystoneauth1/identity/v3/oauth2_mtls_client_credential.py", line 17, in <module>
import six
ModuleNotFoundError: No module named 'six'
* Drop CI jobs related to canal
According to the canal github[1] the repo is not maintained over 5 years.
In addition, the README says
Originally, we thought we might more deeply integrate the two projects
(possibly even going as far as a rebranding!). However, over time it
became clear that that wasn't really necessary to fulfil our goal of
making them work well together. Ultimately, we decided to focus on
adding features to both projects rather than doing work just to
combine them.
So we don't need to run CI jobs related to the canal at this situation.
[1]: https://github.com/projectcalico/canal
* Update ci.md
* chore(helm-apps): fix README example
README shows a non-working example according to the specs for this role.
* Add support for kubelet-csr-approver
Co-Authored-By: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
* Add tests for kubelet-csr-approver
Co-Authored-By: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
* Add Documentation for Kubelet CSR Approver
Co-Authored-By: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
---------
Co-authored-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
* Remove deprecated provider and fix flatcar configs
* Refactor for DRYness
* Add missing line endings
* Enable tests for hetzner terraform in CI
* Add missing inventory for CI tests
* Fix: vSphere Error: `Apply a CSI secret manifest`
This PR will fix an issue that you will see on 2nd deploy when deploying External vSphere
How to re-produce:
1. Set custom `vsphere_csi_namespace: "vmware-system-csi"`
2. Deploy as usual
3. Observe no errors
4. Deploy 2nd time without `reset`
5. Playbook fails with:
```
TASK [kubernetes-apps/csi_driver/vsphere : vSphere CSI Driver | Apply a CSI secret manifest]
fatal: [node-00]: FAILED! => changed=true
censored: 'the output has been hidden due to the fact that ''no_log: true'' was specified for this result'
```
* create namespace if does not exist
* lint fix
* try to fix lint errors
* fix `too few spaces before comment`
* change the order of applied manifests
* typo
* [cilium] fix rbac and upgrade hubble v0.11.0 (#3)
* [cilium] fix rbac for LB bgp ipam
* [cilium] Upgrade Hubble to v0.11.0 and add mTLS between Hubble UI and Hubble Relay
* fix dns domain hubble for tls
---------
Co-authored-by: Thuon Jeremy <d107869@olinfra1.infra.bdm.outscale.c1.dav.fr>
* Fix blank line
---------
Co-authored-by: Thuon Jeremy <d107869@olinfra1.infra.bdm.outscale.c1.dav.fr>
Cilium 1.13.1 changed how the cilium-cni binary gets placed in /opt/cni/bin,
so that it takes place in an init container rather than in the main agent.
This commit removes the variable `use_localhost_as_kubeapi_loadbalancer`
and rather detects that we are in a situation where we can use the
localhost apiserver loadbalancer (meaning that we use the localhost load
balancer and that the same ports are used for both the load balancer and
the kube-apiserver).
This also cleanups the calico code to use `kube_apiserver_global_endpoint`
rather than implementing the same logic all over again.
Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
* node: fix default kubelet/runtime cgroups when kube_reserved is false (default)
Commit 1c4db6132d introduced a notion of
kube_reserved. This introduced a breaking change defaulting to use
kube.slice for the container_manager and the kubelet as if kube_reserved
was always enabled whereas it is disabled by default.
This commit fixes this by bringing back system.slice whenever
kube_reserved is disabled.
Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
* inventory/sample: change false for kube_reserved as its the default
Changing the commented value in sample inventory to the actual default
value.
Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
---------
Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
* network_plugin/custom_cni: add CNI to apply provided manifests
Add a new simple custom_cni to install provided Kubernetes manifests.
This could be useful to use manifests directly provided by a CNI when
there are not support by Kubespray (i.e.: helm chart or any other manifests
generation method).
Co-authored-by: James Landrein <james.landrein@proton.ch>
Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
* network_plugin/custom_cni: add test with cilium
Co-authored-by: James Landrein <james.landrein@proton.ch>
Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
---------
Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
Co-authored-by: James Landrein <james.landrein@proton.ch>
crun_bin_dir was used to specify the destination of the crun binary during the
download process. This path must match with the value provided in the CRI-O
configuration file. So changing its value to bin_dir helps to mismatch errors.
Signed-off-by: Victor Morales <chipahuac@hotmail.com>
requirements-$ANSIBLE_VERSION.yml doesn't exist in Kubespray repo.
That was for supporting ansible 2.10-, and now Kubespray supports
2.11+. So this drops the part to avoid confusion.
Reducing the number of layers, increasing readability, reducing the size of the image (how much I can’t check, it’s impossible for me to build due to the unavailability of the vagrant repository)
* Fix yq install in argocd role: use download_file instead of get_url
* Fix use download_file instead of get_url to download argocd-install manifest in argocd role
* Fix order and add arm64 checksum
* Fix: Failed to template loop_control.label: 'None'
Since Kubespray v2.21.0, the commit a98ab40434 removes copying
contrib/ accidentaly. The contrib/ contains useful tools like offline
tools etc. This adds the contrib/ to Dockerfile again.
1.5.7 was released Aug 2, 2021 and 1.6.1 came out on Dec 13, 2022.
There's been a good amount of new features, improvements and fixes since
1.5.7 and the changelogs for each version are available in the docs:
https://ara.readthedocs.io/en/latest/changelog-release-notes.html
This fixes the CrashLoopBackoff error that appears because envoy
configuration has changed a lot and upstream removed the envoy proxy to
use nginx only instead. Those changes are based on upstream cilium helm.
It is quite confusing that there's an all-caps, bolded comment that seems to imply that `etcd_download_url` is relevant only when not using host-based deployment. The opposite is true: of course the artifact download URL is relevant and required for host-based etcd.
Perhaps the entire comment can be read in a different way, and should perhaps be reworded entirely, cf. 374438a3d6/docs/offline-environment.md (L38)
Removing the "**DON'T**" matches the way the other comments in this file are written and matches my personal interpretation.
This commit uses a kubeadm join config to pull down cert for etcd in
workers nodes (which is needed in some circumstances, for instance with
calico or cilium).
The previous way didn't allow us to pass certain parameters which was
typically given in the config in other kubeadm invokations in Kubespray.
This made kubeadm produced some errors for some edge cases.
For example, in our deployment we don't have a default route and even
though it's only to download the certificates, kubeadm produce an error
`unable to select an IP from default routes` (these command are kubeadm
controlplane command, so kubeadm does some additional checks). This is
fixed by specifying `advertiseAddress` within the kubeadm config.
Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
Add a variable `calico_felix_floatingIPs` which permit to enable calico feature `floatingIPs`
(disabled per default).
Signed-off-by: MatthieuFin <matthieu2717@gmail.com>
#9679
In 6db6c8678c, this was disabled becaue
kubesrpay gave too much permissions that were not needed. This commit
re-enable back this option by default and also removes the extra
permissions that kubespray gave that were in fact not needed.
Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
* optimize cgroups settings for node reserved
* fix
* set cgroup slice for multi container engine
* set cgroup slice for crio
* add reserved cgroups variables to sample files
* Compatible with cgroup path for different container managers
* add cgroups doc
* fix markdown
At the upstream calico development, the v3.21 branch is not updated
over 2 monthes. In addition, unnecessary error message is output at
Kubespray deployment due to different URLs for calico v3.21 or v3.22+
This drops the v3.21 support to solve the issue.
Without minimal cluster configuration, even on a one node control plane,
the health check of the ovn-cental container always fails as it queries the
cluster/status.
Crio registries configuration changed from crio_registries_mirrors to
crio_registries. The configuration in the test was however forgotten.
Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
* feat(): Add wireguard backend to flannel cni
As described in the flannel docs:
https://github.com/flannel-io/flannel/blob/master/Documentation/backends.md#wireguard
This does not support optional configuration methods like:
- setting a psk (will be autogenerated by default)
- chang listening ports
- change mode (defaults to 'separate')
- change PersistentKeepaliveInterval (defaults to 0)
* Add supported backends to flannel docs
* Fix markdown in docs
This fixes a task failure in the rescue block that uncordons nodes after an unsuccessful drain. The issue occurs when `kube_override_hostname` is set and does not match `inventory_hostname`.
The `containerd-common` role is responsible for gathering OS specific variables from the vars directory of the roles that include or import it. `containerd-common` is imported via role dependency by a total of two roles, `container-engine/docker`, and `container-engine/containerd`.
containerd-common is needed by both the docker and containerd roles as a dependency when:
- containerd is selected as the container engine
- a docker install is detected and needs to be removed
- apt is the package manager
However, by default, roles can not be invoked more than once in the same play, unless `allow_duplicates: true` is set for that role. This results in the failure of the `containerd | Remove containerd repository` task, since only the docker vars will be loaded in the play, and `containerd_repo_info.repos`, normally populated by containerd/vars, is left empty.
This change sets `allow_duplicates: true` for `containerd-common` which fixes the currently failing containerd tasks if docker was detected and removed in the same play.
During the reset, restart network was not completing in distros
like RHEL/CentOS/AlmaLinux with major version higher than 8.
Example:
kubespray> ansible-playbook -i inventory/mydomain/hosts.yml reset.yml -b -v
fatal: [mynode]: FAILED! => {"changed": false, "msg": "Could not find the requested service network: host"}
Signed-off-by: Douglas Schilling Landgraf <dlandgra@redhat.com>
Signed-off-by: Douglas Schilling Landgraf <dlandgra@redhat.com>
There is a wrong directory path to all.yml and vsphere.yml. The wrong directory is `inventory/sample/group_vars/all.yml` and `inventory/sample/group_vars/all/vsphere.yml` which should be `inventory/sample/group_vars/all/all.yml` and `inventory/sample/group_vars/all/vsphere.yml`.
When operating kubespray from kubespray image with docker run,
we need to checkout the specific kubespray version as the same as
the image, because the sample inventory contains kubernetes version
and the version of master branch could not be supported on the released
kubespray, for example.
The certified-conformance mode took 2+ hours and that was too long
by comparing Quick mode which was specified previously.
So this updates the mode to Quick again.
The latest version of sonobuoy is v0.56.11.
This updates the version to the latest.
As the file name, this makes it use certified-conformance mode
clearly for the latest version of sonobuoy.
by setting a default runtime spec with a patch for RLIMIT_NOFILE.
- Introduces containerd_base_runtime_spec_rlimit_nofile.
- Generates base_runtime_spec on-the-fly, to use the containerd version
of the node.
- Update and re-work the documentation:
- Update links
- Fix formatting (especially for lists)
- Remove documentation about `useAlphaApi`,
a flag only for k8s versions < v1.10
- Attempt to clarify the doc
- Update to version 1.5.0
- Remove PodSecurityPolicy (deprecated in k8s v1.21+)
- Update ClusterRole following upstream
(cf https://github.com/kubernetes-sigs/sig-storage-local-static-provisioner/pull/292)
- Add nodeSelector to DaemonSet (following upstream)
We made all vagrant jobs non-voting because those jobs were not stable.
However the setting allowed a pull request which broke vagrant jobs
completely merged into the master branch.
To avoid such situation, this makes one of vagrant jobs voting.
Let's see the stability of the job.
The install-cni-plugin image was not updated to the corresponding
arch when building the different DS.
Fixes issue #9460
Signed-off-by: Fred Rolland <frolland@nvidia.com>
Signed-off-by: Fred Rolland <frolland@nvidia.com>
`test` is is a internal Python package (see [doc]), and as such should not be
used here. It make tests fail in some environments.
[doc]: https://docs.python.org/3/library/test.html
* Fix inconsistent handling of admission plugin list
* Adjust hardening doc with the normalized admission plugin list
* Add pre-check for admission plugins format change
* Ignore checking admission plugins value when variable is not defined
On hardening environments, cert-manager pods could not be created
from the corresponding deployments. This adds the securityContext
to solve the issue.
To verify the hardening method works always.
The configuration comes from docs/hardening.md
Fix yaml format of hardening.yml
Add condition to skip 040 test for hardening
busybox container requires a root permission for ping.
For testing hardening method at CI, we need to switch to another image
which doesn't require the root permission for network testing.
On kubernetes/kubernetes repo, we are using agnhost which doesn't
require it. So this makes the test use aghhost image.
In addition, this updates the test manifest to specify securityContext
without any privilege.
* Avoid MetalLB speaker image download when metallb_speaker_enabled is set to
* Move metallb_speaker_enabled var to allow outside metalLB role references
* Move metallb_speaker_enabled var to allow outside metalLB role references
* Improve metallb_speaker_enabled default values
When trying to add a hardening CI job by copying configuration from
hardening.md, yamllint CI job deleted invalid format.
This fixes it for maintaining the CI job.
* Fix: install policy controller on kdd too
* Remove the calico_policy_version condition altogether
* Install policy controller both on canal and calico under same condition
* Support kubeadm patches in v1beta3
* Update kubeadm patches sample files in inventory
* Fix pre-commit syntax
* Set kubeadm_patches enabled to false in sample inventory
* Add optional NAT support in calico router mode
* Add a blank line in front of lists
* Remove mutual exclusivity: NAT and router mode
* Ignore router mode from NAT
* Update calico doc
Since the commit fad296616c cri_dockerd_enabled
has not been used. But the packet_ubuntu22-aio-docker.yml still contains
the configuration and causes confusions.
This removes the configuration for cleanup.
It seems that PR #8839 broke `calico_datastore: etcd` when it removed ipamconfig support for etcd mode.
This PR fixes some failing tasks when `calico_datastore == etcd`, but it does not restore ipamconfig support for calico in etcd mode. If someone wants to restore ipamconfig support for `calico_datastore: etcd` please submit a follow up PR for that.
For the following configuration
```
containerd_insecure_registries:
docker.io:
- dockerhubcache.example.com
```
the rendered /etc/containerd/config.toml contains
```
[plugins."io.containerd.grpc.v1.cri".registry.configs."docker.io".tls]
insecure_skip_verify = true
```
but it needs to be
```
[plugins."io.containerd.grpc.v1.cri".registry.configs."dockerhubcache.example.com".tls]
insecure_skip_verify = true
```
* Add the option to enable default Pod Security Configuration
Enable Pod Security in all namespaces by default with the option to
exempt some namespaces. Without the change only namespaces explicitly
configured will receive the admission plugin treatment.
* Fix the PR according to code review comments
* Revert the latest changes
- leave the empty file when kube_pod_security_use_default, but add comment explaining the empty file
- don't attempt magic at conditionally adding PodSecurity to kube_apiserver_admission_plugins_needs_configuration
* Add 'flush ip6tables' task in reset role
If enable_dual_stack_networks is set to true and ip6 is defined,ip6tables will be created. But when reset the kubernetes cluster, kubespray doesn't flush ip6tables.
* [CI] fix molecule tests on opensuse by upgrading to 15.4 (#9175)
* [CI] fix molecule tests on opensuse by upgrading to 15.4
* [opensuse] use correct python crytography package name depending on distribution version
Co-authored-by: Cristian Calin <6627509+cristicalin@users.noreply.github.com>
This condition blocks the creation of the `etcd` user in certain conditions.
Specifically, when you have a `etcd_deployment_type: kubeadm` and `kube_owner: root`.
Being the `root` user already present on the system, this will not be a problem (due to the idempotency of ansible).
Today we have many contributions to contrib/offline/ and some PRs
contained invalid coding style for those scripts.
This enables shellcheck to make such invalid coding style easily.
* Update main.yaml
* remove version in dpkg_selection name
* make lint happy
* Fix typo
* add comment / remove useless contition
* remove dpkg hold in reset tasks
* This release removes support for Kubernetes v1.19.0
* This release adds support for Kubernetes v1.24.0
* Starting with this release, we will need permissions on the coordination.k8s.io/leases resource for leaderelection lock
The commit 1ce2f04 tried to merge multiple SUSE OS checks including
"openSUSE Leap" and "openSUSE Tumbleweed" into a single SUSE, but
that was a perfect change.
Then the commit c16efc9 tried to fix it for "openSUSE Leap", but it
didn't take care of "openSUSE Tumbleweed".
Then this adds "openSUSE Tumbleweed" to the OS check.
* Fix vcloud-csi bug related to #9046
Signed-off-by: yasintahaerol <yasintahaerol@gmail.com>
* add supervisor-fss-namespace=kube-system flag to vsphere-csi-controller-deployment
Signed-off-by: yasintahaerol <yasintahaerol@gmail.com>
This adds target components on check_readme_versions.sh after
merging https://github.com/kubernetes-sigs/kubespray/pull/9044
In addition, this fixes typo on check_readme_versions.sh
This adds `foo_version` variables for some components because
check_readme_versions.sh verifies the corresponding version for
`<component name>_version` from main.yml. This change also makes
consistency in the main.yml. In long-term, we will be able to
remove the existing `foo_image_tag` variables, but that is not now
for backwards compatibility for users.
During code-review, reviwers needed to take care of README.md also
should be updated when the pull request updated component versions.
This adds the corresponding check to reduce reviwer's burden.
* Added new configuration item for extra tolerations in policy controllers
Signed-off-by: Sébastien Masset <smt.masset@gmail.com>
* Added new configuration item for extra tolerations in DNS autoscaler
Signed-off-by: Sébastien Masset <smt.masset@gmail.com>
* Aligned existing handling of extra DNS tolerations
Signed-off-by: Sébastien Masset <smt.masset@gmail.com>
* feat: make kubernetes owner parametrized
* docs: update hardening guide with configuration for CIS 1.1.19
* fix: set etcd data directory permissions to be compliant to CIS 1.1.12
* extra admission controls now don't have a version in their file names
eventratelimit.v1beta2.yaml.j2 -> eventratelimit.yaml.j2
* cri_socket variable includes the unix:// prefix to be conformat with
upstream
When running molecule jobs, we saw the folloing warning message:
[DEPRECATION WARNING]: [defaults]callback_whitelist option, normalizing names
to new standard, use callbacks_enabled instead. This feature will be removed
from ansible-core in version 2.15. Deprecation warnings can be disabled by
setting deprecation_warnings=False in ansible.cfg.
callbacks_enabled has been added since Ansible 2.11 and Kubespray is using
Ansible 2.12 at master branch. So we can use callbacks_enabled safely to
avoid the warning message.
* Allow disabling calico CNI logs with calico_cni_log_file_path
Calico CNI logs up to 1G if it log a lot with current default settings:
log_file_max_size 100 Max file size in MB log files can reach before they are rotated.
log_file_max_age 30 Max age in days that old log files will be kept on the host before they are removed.
log_file_max_count 10 Max number of rotated log files allowed on the host before they are cleaned up.
See https://projectcalico.docs.tigera.io/reference/cni-plugin/configuration#logging
To save disk space, make the path configurable and allow disabling this log by setting
`calico_cni_log_file_path: false`
* Fix markdown
* Update roles/network_plugin/canal/templates/cni-canal.conflist.j2
Co-authored-by: Kenichi Omichi <ken1ohmichi@gmail.com>
Co-authored-by: Kenichi Omichi <ken1ohmichi@gmail.com>
Kubespray uses `yamllint` and `ansible-lint`. To run them locally use `yamllint .` and `ansible-lint`. It is a good idea to add call these tools as part of your pre-commit hook and avoid a lot of back end forth on fixing linting issues (<https://support.gitkraken.com/working-with-repositories/githooksexample/>).
Kubespray uses [pre-commit](https://pre-commit.com) hook configuration to run several linters, please install this tool and use it to run validation tests before submitting a PR.
```ShellSession
pre-commit install
pre-commit run -a # To run pre-commit hook on all files in the repository, even if they were not modified
```
#### Molecule
@@ -33,7 +38,9 @@ Vagrant with VirtualBox or libvirt driver helps you to quickly spin test cluster
1. Submit an issue describing your proposed change to the repo in question.
2. The [repo owners](OWNERS) will respond to your issue promptly.
3. Fork the desired repo, develop and test your code changes.
4.Sign the CNCF CLA (<https://git.k8s.io/community/CLA.md#the-contributor-license-agreement>)
5.Submit a pull request.
6.Work with the reviewers on their suggestions.
7.Ensure to rebase to the HEAD of your target branch and squash un-necessary commits (<https://blog.carbonfive.com/always-squash-and-rebase-your-git-commits/>) before final merger of your contribution.
4.Install [pre-commit](https://pre-commit.com) and install it in your development repo.
5.Addess any pre-commit validation failures.
6.Sign the CNCF CLA (<https://git.k8s.io/community/CLA.md#the-contributor-license-agreement>)
7.Submit a pull request.
8. Work with the reviewers on their suggestions.
9. Ensure to rebase to the HEAD of your target branch and squash un-necessary commits (<https://blog.carbonfive.com/always-squash-and-rebase-your-git-commits/>) before final merger of your contribution.
Note: When Ansible is already installed via system packages on the control machine, other python packages installed via `sudo pip install -r requirements.txt` will go to a different directory tree (e.g. `/usr/local/lib/python2.7/dist-packages` on Ubuntu) from Ansible's (e.g. `/usr/lib/python2.7/dist-packages/ansible` still on Ubuntu).
As a consequence, `ansible-playbook` command will fail with:
Note: When Ansible is already installed via system packages on the control node,
Python packages installed via `sudo pip install -r requirements.txt` will go to
a different directory tree (e.g. `/usr/local/lib/python2.7/dist-packages` on
Ubuntu) from Ansible's (e.g. `/usr/lib/python2.7/dist-packages/ansible` still on
Ubuntu). As a consequence, the `ansible-playbook` command will fail with:
```raw
ERROR! no action detected in task. This often indicates a misspelled module name, or incorrect module path.
```
probably pointing on a task depending on a module present in requirements.txt.
This likely indicates that a task depends on a module present in ``requirements.txt``.
One way of solving this would be to uninstall the Ansible package and then, to install it via pip but it is not always possible.
A workaround consists of setting `ANSIBLE_LIBRARY` and `ANSIBLE_MODULE_UTILS` environment variables respectively to the `ansible/modules` and `ansible/module_utils` subdirectories of pip packages installation location, which can be found in the Location field of the output of `pip show [package]` before executing `ansible-playbook`.
One way of addressing this is to uninstall the system Ansible package then
reinstall Ansible via ``pip``, but this not always possible and one must
take care regarding package versions.
A workaround consists of setting the `ANSIBLE_LIBRARY`
and `ANSIBLE_MODULE_UTILS` environment variables respectively to
the `ansible/modules` and `ansible/module_utils` subdirectories of the ``pip``
installation location, which is the ``Location`` shown by running
`pip show [package]` before executing `ansible-playbook`.
A simple way to ensure you get all the correct version of Ansible is to use the [pre-built docker image from Quay](https://quay.io/repository/kubespray/kubespray?tab=tags).
You will then need to use [bind mounts](https://docs.docker.com/storage/bind-mounts/) to get the inventory and ssh key into the container, like this:
A simple way to ensure you get all the correct version of Ansible is to use
the [pre-built docker image from Quay](https://quay.io/repository/kubespray/kubespray?tab=tags).
You will then need to use [bind mounts](https://docs.docker.com/storage/bind-mounts/)
to access the inventory and SSH key in the container, like this:
```ShellSession
docker pull quay.io/kubespray/kubespray:v2.19.0
git checkout v2.22.2
docker pull quay.io/kubespray/kubespray:v2.22.2
docker run --rm -it --mount type=bind,source="$(pwd)"/inventory/sample,dst=/inventory \
- The list of available docker version is 18.09, 19.03 and 20.10. The recommended docker version is 20.10. The kubelet might break on docker's non-standard version numbering (it no longer uses semantic versioning). To ensure auto-updates don't break your cluster look into e.g. yum versionlock plugin or apt pin).
- Supported Docker versions are 18.09, 19.03 and 20.10. The *recommended* Docker version is 20.10. `Kubelet` might break on docker's non-standard version numbering (it no longer uses semantic versioning). To ensure auto-updates don't break your cluster look into e.g. the YUM ``versionlock`` plugin or ``apt pin``).
- The cri-o version should be aligned with the respective kubernetes version (i.e. kube_version=1.20.x, crio_version=1.20)
## Requirements
- **Minimum required version of Kubernetes is v1.21**
- **Ansible v2.9.x, Jinja 2.11+ and python-netaddr is installed on the machine that will run Ansible commands**
- **Minimum required version of Kubernetes is v1.24**
- **Ansible v2.11+, Jinja 2.11+ and python-netaddr is installed on the machine that will run Ansible commands**
- The target servers must have **access to the Internet** in order to pull docker images. Otherwise, additional configuration is required (See [Offline Environment](docs/offline-environment.md))
- The target servers are configured to allow **IPv4 forwarding**.
- If using IPv6 for pods and services, the target servers are configured to allow **IPv6 forwarding**.
- The **firewalls are not managed**, you'll need to implement your own rules the way you used to.
in order to avoid any issue during deployment you should disable your firewall.
- If kubespray is ran from non-root user account, correct privilege escalation method
- If kubespray is run from non-root user account, correct privilege escalation method
should be configured in the target servers. Then the `ansible_become` flag
or command parameters `--become or -b` should be specified.
Hardware:
These limits are safeguarded by Kubespray. Actual requirements for your workload can differ. For a sizing guide go to the [Building Large Clusters](https://kubernetes.io/docs/setup/cluster-large/#size-of-master-and-master-components) guide.
These limits are safeguarded by Kubespray. Actual requirements for your workload can differ. For a sizing guide go to the [Building Large Clusters](https://kubernetes.io/docs/setup/cluster-large/#size-of-master-and-master-components) guide.
- Master
- Memory: 1500 MB
@@ -183,7 +223,7 @@ These limits are safe guarded by Kubespray. Actual requirements for your workloa
## Network Plugins
You can choose between 10 network plugins. (default: `calico`, except Vagrant uses `flannel`)
You can choose among ten network plugins. (default: `calico`, except Vagrant uses `flannel`)
@@ -192,8 +232,6 @@ You can choose between 10 network plugins. (default: `calico`, except Vagrant us
and overlay networks, with or without BGP. Calico uses the same engine to enforce network policy for hosts,
pods, and (if using Istio and Envoy) applications at the service mesh layer.
- [canal](https://github.com/projectcalico/canal): a composition of calico and flannel plugins.
- [cilium](http://docs.cilium.io/en/latest/): layer 3/4 networking (as well as layer 7 to protect and secure application protocols), supports dynamic insertion of BPF bytecode into the Linux kernel to implement security services, networking and visibility logic.
- [weave](docs/weave.md): Weave is a lightweight container overlay network that doesn't require an external K/V database cluster.
@@ -210,7 +248,10 @@ You can choose between 10 network plugins. (default: `calico`, except Vagrant us
- [multus](docs/multus.md): Multus is a meta CNI plugin that provides multiple network interface support to pods. For each interface Multus delegates CNI calls to secondary CNI plugins such as Calico, macvlan, etc.
The choice is defined with the variable `kube_network_plugin`. There is also an
- [custom_cni](roles/network-plugin/custom_cni/) : You can specify some manifests that will be applied to the clusters to bring you own CNI and use non-supported ones by Kubespray.
See `tests/files/custom_cni/README.md` and `tests/files/custom_cni/values.yaml`for an example with a CNI provided by a Helm Chart.
The network plugin to use is defined by the variable `kube_network_plugin`. There is also an
option to leverage built-in cloud provider networking instead.
See also [Network checker](docs/netcheck.md).
@@ -231,10 +272,11 @@ See also [Network checker](docs/netcheck.md).
@@ -9,10 +9,10 @@ The Kubespray Project is released on an as-needed basis. The process is as follo
5. Create the release note with [Kubernetes Release Notes Generator](https://github.com/kubernetes/release/blob/master/cmd/release-notes/README.md). See the following `Release note creation` section for the details.
6. An approver creates [new release in GitHub](https://github.com/kubernetes-sigs/kubespray/releases/new) using a version and tag name like `vX.Y.Z` and attaching the release notes
7. An approver creates a release branch in the form `release-X.Y`
8. The corresponding version of [quay.io/kubespray/kubespray:vX.Y.Z](https://quay.io/repository/kubespray/kubespray) and [quay.io/kubespray/vagrant:vX.Y.Z](https://quay.io/repository/kubespray/vagrant) docker images are built and tagged
8. The corresponding version of [quay.io/kubespray/kubespray:vX.Y.Z](https://quay.io/repository/kubespray/kubespray) and [quay.io/kubespray/vagrant:vX.Y.Z](https://quay.io/repository/kubespray/vagrant) container images are built and tagged. See the following `Container image creation` section for the details.
9. The `KUBESPRAY_VERSION` variable is updated in `.gitlab-ci.yml`
10. The release issue is closed
11. An announcement email is sent to `kubernetes-dev@googlegroups.com` with the subject `[ANNOUNCE] Kubespray $VERSION is released`
11. An announcement email is sent to `dev@kubernetes.io` with the subject `[ANNOUNCE] Kubespray $VERSION is released`
12. The topic of the #kubespray channel is updated with `vX.Y.Z is released! | ...`
If the release note file(/tmp/kubespray-release-note) contains "### Uncategorized" pull requests, those pull requests don't have a valid kind label(`kind/feature`, etc.).
It is necessary to put a valid label on each pull request and run the above release-notes command again to get a better release note)
It is necessary to put a valid label on each pull request and run the above release-notes command again to get a better release note
## Container image creation
The container image `quay.io/kubespray/kubespray:vX.Y.Z` can be created from Dockerfile of the kubespray root directory:
@@ -14,12 +14,16 @@ This role performs basic installation and setup of Gluster, but it does not conf
Available variables are listed below, along with default values (see `defaults/main.yml`):
glusterfs_default_release: ""
```yaml
glusterfs_default_release:""
```
You can specify a `default_release` for apt on Debian/Ubuntu by overriding this variable. This is helpful if you need a different package or version for the main GlusterFS packages (e.g. GlusterFS 3.5.x instead of 3.2.x with the `wheezy-backports` default release on Debian Wheezy).
glusterfs_ppa_use: yes
glusterfs_ppa_version: "3.5"
```yaml
glusterfs_ppa_use:yes
glusterfs_ppa_version:"3.5"
```
For Ubuntu, specify whether to use the official Gluster PPA, and which version of the PPA to use. See Gluster's [Getting Started Guide](https://docs.gluster.org/en/latest/Quick-Start-Guide/Quickstart/) for more info.
@@ -29,9 +33,11 @@ None.
## Example Playbook
```yaml
- hosts:server
roles:
- geerlingguy.glusterfs
```
For a real-world use example, read through [Simple GlusterFS Setup with Ansible](http://www.jeffgeerling.com/blog/simple-glusterfs-setup-ansible), a blog post by this role's author, which is included in Chapter 8 of [Ansible for DevOps](https://www.ansiblefordevops.com/).
- Terraform automatically creates an Ansible Inventory file called `hosts` with the created infrastructure in the directory `inventory`
- Ansible will automatically generate an ssh config file for your bastion hosts. To connect to hosts with ssh using bastion host use generated ssh-bastion.conf.
Ansible automatically detects bastion and changes ssh_args
- Ansible will automatically generate an ssh config file for your bastion hosts. To connect to hosts with ssh using bastion host use generated `ssh-bastion.conf`. Ansible automatically detects bastion and changes `ssh_args`
@@ -12,7 +12,7 @@ This will install a Kubernetes cluster on Equinix Metal. It should work in all l
The terraform configuration inspects variables found in
[variables.tf](variables.tf) to create resources in your Equinix Metal project.
There is a [python script](../terraform.py) that reads the generated`.tfstate`
file to generate a dynamic inventory that is consumed by [cluster.yml](../../..//cluster.yml)
file to generate a dynamic inventory that is consumed by [cluster.yml](../../../cluster.yml)
to actually install Kubernetes with Kubespray.
### Kubernetes Nodes
@@ -60,16 +60,16 @@ Terraform will be used to provision all of the Equinix Metal resources with base
Create an inventory directory for your cluster by copying the existing sample and linking the `hosts` script (used to build the inventory based on Terraform state):
This will be the base for subsequent Terraform commands.
#### Equinix Metal API access
Your Equinix Metal API key must be available in the `PACKET_AUTH_TOKEN` environment variable.
Your Equinix Metal API key must be available in the `METAL_AUTH_TOKEN` environment variable.
This key is typically stored outside of the code repo since it is considered secret.
If someone gets this key, they can startup/shutdown hosts in your project!
@@ -80,10 +80,12 @@ The Equinix Metal Project ID associated with the key will be set later in `clust
For more information about the API, please see [Equinix Metal API](https://metal.equinix.com/developers/api/).
For more information about terraform provider authentication, please see [the equinix provider documentation](https://registry.terraform.io/providers/equinix/equinix/latest/docs).
Example:
```ShellSession
export PACKET_AUTH_TOKEN="Example-API-Token"
export METAL_AUTH_TOKEN="Example-API-Token"
```
Note that to deploy several clusters within the same project you need to use [terraform workspace](https://www.terraform.io/docs/state/workspaces.html#using-workspaces).
@@ -101,7 +103,7 @@ This helps when identifying which hosts are associated with each cluster.
While the defaults in variables.tf will successfully deploy a cluster, it is recommended to set the following values:
- cluster_name = the name of the inventory directory created above as $CLUSTER
- metal_project_id = the Equinix Metal Project ID associated with the Equinix Metal API token above
- equinix_metal_project_id = the Equinix Metal Project ID associated with the Equinix Metal API token above
#### Enable localhost access
@@ -119,12 +121,13 @@ Once the Kubespray playbooks are run, a Kubernetes configuration file will be wr
In the cluster's inventory folder, the following files might be created (either by Terraform
or manually), to prevent you from pushing them accidentally they are in a
`.gitignore` file in the `terraform/metal` directory :
`.gitignore` file in the `contrib/terraform/equinix` directory :
- `.terraform`
- `.tfvars`
- `.tfstate`
- `.tfstate.backup`
- `.lock.hcl`
You can still add them manually if you want to.
@@ -135,7 +138,7 @@ plugins. This is accomplished as follows:
@@ -270,6 +270,7 @@ For your cluster, edit `inventory/$CLUSTER/cluster.tfvars`.
|`supplementary_node_groups` | To add ansible groups to the nodes, such as `kube_ingress` for running ingress controller pods, empty by default. |
|`bastion_allowed_remote_ips` | List of CIDR allowed to initiate a SSH connection, `["0.0.0.0/0"]` by default |
|`master_allowed_remote_ips` | List of CIDR blocks allowed to initiate an API connection, `["0.0.0.0/0"]` by default |
|`bastion_allowed_ports` | List of ports to open on bastion node, `[]` by default |
|`k8s_allowed_remote_ips` | List of CIDR allowed to initiate a SSH connection, empty by default |
|`worker_allowed_ports` | List of ports to open on worker nodes, `[{ "protocol" = "tcp", "port_range_min" = 30000, "port_range_max" = 32767, "remote_ip_prefix" = "0.0.0.0/0"}]` by default |
|`master_allowed_ports` | List of ports to open on master nodes, expected format is `[{ "protocol" = "tcp", "port_range_min" = 443, "port_range_max" = 443, "remote_ip_prefix" = "0.0.0.0/0"}]`, empty by default |
@@ -283,6 +284,7 @@ For your cluster, edit `inventory/$CLUSTER/cluster.tfvars`.
|`master_server_group_policy` | Enable and use openstack nova servergroups for masters with set policy, default: "" (disabled) |
|`node_server_group_policy` | Enable and use openstack nova servergroups for nodes with set policy, default: "" (disabled) |
|`etcd_server_group_policy` | Enable and use openstack nova servergroups for etcd with set policy, default: "" (disabled) |
|`additional_server_groups` | Extra server groups to create. Set "policy" to the policy for the group, expected format is `{"new-server-group" = {"policy" = "anti-affinity"}}`, default: {} (to not create any extra groups) |
|`use_access_ip` | If 1, nodes with floating IPs will transmit internal cluster traffic via floating IPs; if 0 private IPs will be used instead. Default value is 1. |
|`port_security_enabled` | Allow to disable port security by setting this to `false`. `true` by default |
|`force_null_port_security` | Set `null` instead of `true` or `false` for `port_security`. `false` by default |
@@ -291,10 +293,32 @@ For your cluster, edit `inventory/$CLUSTER/cluster.tfvars`.
##### k8s_nodes
Allows a custom definition of worker nodes giving the operator full control over individual node flavor and
availability zone placement. To enable the use of this mode set the `number_of_k8s_nodes` and
`number_of_k8s_nodes_no_floating_ip` variables to 0. Then define your desired worker node configuration
using the `k8s_nodes`variable.
Allows a custom definition of worker nodes giving the operator full control over individual node flavor and availability zone placement.
To enable the use of this mode set the `number_of_k8s_nodes` and`number_of_k8s_nodes_no_floating_ip` variables to 0.
Then define your desired worker node configuration using the `k8s_nodes` variable.
The `az`, `flavor` and `floating_ip`parameters are mandatory.
The optional parameter `extra_groups` (a comma-delimited string) can be used to define extra inventory group memberships for specific nodes.
```yaml
k8s_nodes:
node-name:
az:string# Name of the AZ
flavor:string# Flavor ID to use
floating_ip:bool# If floating IPs should be created or not
extra_groups:string# (optional) Additional groups to add for kubespray, defaults to no groups
image_id:string# (optional) Image ID to use, defaults to var.image_id or var.image
root_volume_size_in_gb:number# (optional) Size of the block storage to use as root disk, defaults to var.node_root_volume_size_in_gb or to use volume from flavor otherwise
volume_type:string# (optional) Volume type to use, defaults to var.node_volume_type
network_id:string# (optional) Use this network_id for the node, defaults to either var.network_id or ID of var.network_name
server_group:string# (optional) Server group to add this node to. If set, this has to be one specified in additional_server_groups, defaults to use the server group specified in node_server_group_policy
cloudinit:# (optional) Options for cloud-init
extra_partitions:# List of extra partitions (other than the root partition) to setup during creation
volume_path:string# Path to the volume to create partition for (e.g. /dev/vda )
partition_path:string# Path to the partition (e.g. /dev/vda2 )
mount_path:string# Path to where the partition should be mounted
partition_start:string# Where the partition should start (e.g. 10GB ). Note, if you set the partition_start to 0 there will be no space left for the root partition
partition_end:string# Where the partition should end (e.g. 10GB or -1 for end of volume)
```
For example:
@@ -314,6 +338,7 @@ k8s_nodes = {
"az" = "sto3"
"flavor" = "83d8b44a-26a0-4f02-a981-079446926445"
"floating_ip" = true
"extra_groups" = "calico_rr"
}
}
```
@@ -424,7 +449,7 @@ This should finish fairly quickly telling you Terraform has successfully initial
You can apply cloud-init based customization for the openstack instances before provisioning your cluster.
One common template is used for all instances. Adjust the file shown below:
*`loadbalancer_plan`: Plan to use for load balancer *(development|production-small)*
*`loadbalancers`: Ports to load balance and which machines to forward to. Key of this object will be used as the name of the load balancer frontends/backends
*`port`: Port to load balance.
*`target_port`: Port to the backend servers.
*`backend_servers`: List of servers that traffic to the port should be forwarded to.
*`server_groups`: Group servers together
*`servers`: The servers that should be included in the group.
*`anti_affinity`: If anti-affinity should be enabled, try to spread the VMs out on separate nodes.
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.