Commit Graph

330 Commits

Author SHA1 Message Date
Vlad Korolev
9a7b021eb8 Do not use ‘yes/no’ for boolean values (#11472)
Consistent boolean values in ansible playbooks
2024-08-28 06:30:56 +01:00
Lihai Tu
8208a3f04f Rename systemd module to systemd_service (#11396)
Signed-off-by: tu1h <lihai.tu@daocloud.io>
2024-07-26 01:11:39 -07:00
Tom M.
242edd14ff Fix etcd certificate to acces address as SAN (#11388) 2024-07-25 18:49:23 -07:00
Bas
8f5f75211f Improving yamllint configuration (#11389)
Signed-off-by: Bas Meijer <bas.meijer@enexis.nl>
2024-07-25 18:42:20 -07:00
Max Gautier
d50f61eae5 pre-commit: apply autofixes hooks and fix the rest manually
- markdownlint (manual fix)
- end-of-file-fixer
- requirements-txt-fixer
- trailing-whitespace
2024-05-28 13:26:44 +02:00
Ugur Can Ozturk
a512b861e0 [etcd/tracing]: fix etcd sampling rate flag (#11175)
Signed-off-by: Ugur Ozturk <ugurozturk918@gmail.com>
2024-05-13 03:14:39 -07:00
yun
13e1f33898 Correct the POLY1305 cipher suites by adding the suffix _SHA256 (#10641) 2024-01-22 18:00:52 +01:00
Ugur Can Ozturk
ae780e6a9b [etcd]: add etcd distributed tracing flags (#10666)
* [etcd]: add etcd distributed tracing flags

Signed-off-by: Ugur Ozturk <ugurozturk918@gmail.com>

* [etcd]: add etcd distributed tracing flags - fix

Signed-off-by: Ugur Ozturk <ugurozturk918@gmail.com>

* [etcd]: add etcd distributed tracing flags - fix

Signed-off-by: Ugur Ozturk <ugurozturk918@gmail.com>

---------

Signed-off-by: Ugur Ozturk <ugurozturk918@gmail.com>
2023-12-19 04:00:10 +01:00
Max Gautier
0fb404c775 etcd: use dynamic group for certs generation check (#10610)
We take advantage of group_by to create the list of nodes needing new
certs, instead of manually looping inside a Jinja template.

This should make the role more readable and less susceptible to
white space problems.
2023-12-12 11:22:29 +01:00
Max Gautier
0d4f57aa22 Validate systemd unit files (#10597)
* Validate systemd unit files

This ensure that we fail early if we have a bad systemd unit file
(syntax error, using a version not available in the local version, etc)

* Hack to check systemd version for service files validation

factory-reset.target was introduced in system 250, same version as the
aliasing feature we need for verifying systemd services with ansible.
So we only actually executes the validation if that target is present.

This is an horrible hack which should be reverted as soon as we drop
support for distributions with systemd<250.
2023-11-17 20:01:23 +01:00
Max Gautier
8ebeb88e57 Refactor "multi" handlers to use listen (#10542)
* containerd: refactor handlers to use 'listen'

* cri-dockerd: refactor handlers to use 'listen'

* cri-o: refactor handlers to use 'listen'

* docker: refactor handlers to use 'listen'

* etcd: refactor handlers to use 'listen'

* control-plane: refactor handlers to use 'listen'

* kubeadm: refactor handlers to use 'listen'

* node: refactor handlers to use 'listen'

* preinstall: refactor handlers to use 'listen'

* calico: refactor handlers to use 'listen'

* kube-router: refactor handlers to use 'listen'

* macvlan: refactor handlers to use 'listen'
2023-11-08 12:28:30 +01:00
Max Gautier
8f0e553e11 etcd/backup: native ansible modules instead of shell (#10540)
This make native ansible features (dry-run, changed state) easier to
have, and should have a minimal performance impact, since it only runs
on the etcd members.
2023-10-30 20:05:28 +01:00
Max Gautier
0b2e5b2f82 Retries ssh connection for Gather node certs (#10515)
This allows this task to work with a forks count > 10 and the default
configuration of sshd, which is to limit sessions to 10. (see
MaxSessions in sshd_config).

Since this is a delegate_to task, it connects to the same host (first
etcd) for each node in the cluster, thus easily going above 10.

Raising the ssh connection attempts allow for more robustness, without
decreasing the forks count or serialising the tasks, which could slow
the task (or the playbook as a whole, if decreasing forks).
2023-10-19 05:04:29 +02:00
Samuel Liu
e1881fae02 Install etcdutl file by default (#10385) 2023-08-23 07:04:22 -07:00
Francisco Orselli
7295d13d60 [EOS-11830] Use ETCD port 2381 for metrics (#10332) 2023-08-08 11:06:16 -07:00
Arthur Outhenin-Chalandre
36e5d742dc Resolve ansible-lint name errors (#10253)
* project: fix ansible-lint name

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>

* project: ignore jinja template error in names

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>

* project: capitalize ansible name

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>

* project: update notify after name capitalization

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>

---------

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
2023-07-26 07:36:22 -07:00
yangsenzk
13aa32278a bugfix: fix grep command without -w option causing prefix matched while adding one etcd member (#10291) 2023-07-13 21:43:29 -07:00
Arthur Outhenin-Chalandre
5d00b851ce project: fix var-spacing ansible rule (#10266)
* project: fix var-spacing ansible rule

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>

* project: fix spacing on the beginning/end of jinja template

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>

* project: fix spacing of default filter

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>

* project: fix spacing between filter arguments

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>

* project: fix double space at beginning/end of jinja

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>

* project: fix remaining jinja[spacing] ansible-lint warning

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>

---------

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
2023-07-04 20:36:54 -07:00
Arthur Outhenin-Chalandre
f8f197e26b Fix outdated tag and experimental ansible-lint rules (#10254)
* project: fix outdated tag and experimental

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>

* project: remove no longer useful noqa 301

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>

* project: replace unnamed-task by name[missing]

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>

* project: fix daemon-reload -> daemon_reload

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>

---------

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
2023-06-30 02:51:57 -07:00
Arthur Outhenin-Chalandre
25cb90bc2d Upgrade ansible (#10190)
* project: update all dependencies including ansible

Upgrade to ansible 7.x and ansible-core 2.14.x. There seems to be issue
with ansible 8/ansible-core 2.15 so we remain on those versions for now.
It's quite a big bump already anyway.

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>

* tests: install aws galaxy collection

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>

* ansible-lint: disable various rules after ansible upgrade

Temporarily disable a bunch of linting action following ansible upgrade.
Those should be taken care of separately.

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>

* project: resolve deprecated-module ansible-lint error

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>

* project: resolve no-free-form ansible-lint error

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>

* project: resolve schema[meta] ansible-lint error

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>

* project: resolve schema[playbook] ansible-lint error

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>

* project: resolve schema[tasks] ansible-lint error

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>

* project: resolve risky-file-permissions ansible-lint error

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>

* project: resolve risky-shell-pipe ansible-lint error

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>

* project: remove deprecated warn args

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>

* project: use fqcn for non builtin tasks

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>

* project: resolve syntax-check[missing-file] for contrib playbook

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>

* project: use arithmetic inside jinja to fix ansible 6 upgrade

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>

---------

Signed-off-by: Arthur Outhenin-Chalandre <arthur.outhenin-chalandre@proton.ch>
2023-06-26 03:15:45 -07:00
Kenichi Omichi
7afbdb3e1e Drop canal network_plugin (#10100)
According to the canal github[1] the repo is not maintained over 5 years.
In addition, the README says
```
  Originally, we thought we might more deeply integrate the two projects
  (possibly even going as far as a rebranding!). However, over time it
  became clear that that wasn't really necessary to fulfil our goal of
  making them work well together. Ultimately, we decided to focus on
  adding features to both projects rather than doing work just to
  combine them.
```
So it is difficult to support canal by Kubespray at this situation.

[1]: https://github.com/projectcalico/canal
2023-05-18 03:40:33 -07:00
Kei Kori
dc33a1971d [etcd] fix make-ssl-etcd.sh.j2; move pem files only if any new certs exist (#9974) 2023-04-12 21:52:35 -07:00
Karl Fischer
6278b12af6 fixed clinet to client 2023-02-20 10:09:03 +01:00
Bas
2c93c997cf pre-commit autocorrected files (#9750) 2023-02-06 01:35:16 -08:00
ERIK
20d99886ca Update etcd log-level parameter name (#9540)
Signed-off-by: bo.jiang <bo.jiang@daocloud.io>

Signed-off-by: bo.jiang <bo.jiang@daocloud.io>
2022-12-05 01:05:03 -08:00
Samuel Liu
dd4bc5fbfe [etcd] Sometimes, we do not need to run etcd role on all nodes. (#9173)
* WIP: sometimes,we not run etcd

* fix ansible lint

* like calico(kdd) cni, no need run etcd
2022-09-09 01:29:22 -07:00
ERIK
9ad2d24ad8 Add unsafe_show_logs switch (#9164)
Signed-off-by: bo.jiang <bo.jiang@daocloud.io>

Signed-off-by: bo.jiang <bo.jiang@daocloud.io>
2022-08-16 18:52:48 -07:00
emiran-orange
2b97b661d8 Move old etcd backup removal after etcd restart (#9147) 2022-08-05 08:09:59 -07:00
Kay Yan
1d0b3829ed remove-etcd-unsupported-arch (#9049) 2022-07-04 05:39:24 -07:00
Alessio Greggi
97b4d79ed5 feat: make kubernetes owner parametrized (#8952)
* feat: make kubernetes owner parametrized

* docs: update hardening guide with configuration for CIS 1.1.19

* fix: set etcd data directory permissions to be compliant to CIS 1.1.12
2022-06-17 01:34:32 -07:00
Necatican Yıldırım
dc1af5a9c5 [etcd] Add support for setting the request size limit (#8849)
* [etcd] Add extra documentation for `etcd_memory_limit` and `etcd_quota_backend_bytes`

Signed-off-by: necatican <necaticanyildirim@gmail.com>

* [etcd] Add support for setting ETCD_MAX_REQUEST_BYTES

Signed-off-by: necatican <necaticanyildirim@gmail.com>
2022-05-23 09:36:03 -07:00
Florian Ruynat
1c0df78278 Add ETCD_EXPERIMENTAL_INITIAL_CORRUPT_CHECK flag to etcd config (#8664) 2022-03-31 08:17:01 -07:00
Mac Chaffee
512d5e3348 Restart etcd if the etcd version changes (#8556)
Signed-off-by: Mac Chaffee <me@macchaffee.com>
2022-03-11 18:08:23 -08:00
Tom Janson
ddef7e1139 missing "check_mode: no"s for several read-only tasks (#8584)
this is not complete -- there are almost certainly more instances of
this issue
2022-03-02 09:29:14 -08:00
Ilya Margolin
e053ee4272 Check all places with check_mode: no for side effects (#8573)
and fix the one with side effect.

Also removes `notify` from this task as the task has `changed_when: false`
and notify is not going to fire.
2022-02-23 01:20:18 -08:00
zhengtianbao
a16d427536 Set etcd-events listen port to 2383 (#8232) 2021-12-07 00:28:01 -08:00
Mathieu Parent
0263c649f4 Allow to scrape etcd metrics using a service (#8203)
Signed-off-by: Mathieu Parent <math.parent@gmail.com>
2021-11-17 23:53:01 -08:00
Florian Ruynat
9eacde212f Fix quorum check when recovering broken etcd cluster (#8126) 2021-10-26 15:23:09 -07:00
Iago Santos
43958614e3 Fix kubespray flatcar ansible_os_family and ansible_distribution (#8029)
Closes https://github.com/kubernetes-sigs/kubespray/issues/8028

Signed-off-by: Iago Santos <iago.santos.pardo@adfinis.com>
2021-10-01 09:11:23 -07:00
Florian Ruynat
88c11b5946 Revert "etcd: enable v2 api only if needed (#8001)" (#8008)
This reverts commit c0e1211abe.
2021-09-23 10:43:14 -07:00
Max Gautier
c0e1211abe etcd: enable v2 api only if needed (#8001)
* etcd: enable v2 api only if needed

Only enable v2 API if we have a consumer (flannel)
This reduce the exposed surface for etcd.

* Fix bad group name
2021-09-22 12:36:32 -07:00
Cristian Calin
7516fe142f Move to Ansible 3.4.0 (#7672)
* Ansible: move to Ansible 3.4.0 which uses ansible-base 2.10.10

* Docs: add a note about ansible upgrade post 2.9.x

* CI: ensure ansible is removed before ansible 3.x is installed to avoid pip failures

* Ansible: use newer ansible-lint

* Fix ansible-lint 5.0.11 found issues

* syntax issues
* risky-file-permissions
* var-naming
* role-name
* molecule tests

* Mitogen: use 0.3.0rc1 which adds support for ansible 2.10+

* Pin ansible-base to 2.10.11 to get package fix on RHEL8
2021-07-12 00:00:47 -07:00
Hari Hud
f07e24db8f Cleanup duplicate task in etcd role (#7598)
* Remove the duplicate task in etcd role

* Remove inessential delegate_to
2021-05-10 16:11:36 -07:00
Cristian Calin
360aff4a57 Rename ansible groups to use _ instead of - (#7552)
* rename ansible groups to use _ instead of -

k8s-cluster -> k8s_cluster
k8s-node -> k8s_node
calico-rr -> calico_rr
no-floating -> no_floating

Note: kube-node,k8s-cluster groups in upgrade CI
      need clean-up after v2.16 is tagged

* ensure old groups are mapped to the new ones
2021-04-29 05:20:50 -07:00
Etienne Champetier
f0cdf71ccb Remove vault (#7400)
* Remove contrib/vault

This is marked as broken since 2018 / 3dcb914607
This still reference apiserver.pem, not used since ddffdb63bf

Signed-off-by: Etienne Champetier <e.champetier@ateme.com>

* Finish nuking vault from the codebase

Signed-off-by: Etienne Champetier <e.champetier@ateme.com>
2021-03-24 09:26:08 -07:00
Samuel Liu
274e06a48d add etcd max snapshot and wals (#7382) 2021-03-18 16:48:36 -07:00
Etienne Champetier
de1d9df787 Only use stat get_checksum: yes when needed (#7270)
By default Ansible stat module compute checksum, list extended attributes and find mime type
To find all stat invocations that really use one of those:
git grep -F stat. | grep -vE 'stat.(islnk|exists|lnk_source|writeable)'

Signed-off-by: Etienne Champetier <e.champetier@ateme.com>
2021-02-10 05:36:59 -08:00
David Louks
aad78840a0 Updated etcd cert check tasks to detect when new cert gen is required (#7219)
* Added force_etcd_cert_refresh var to maintain existing functionality. Broke out etcd node cert syncing from member and admin cert sync logic. Now first etcd will sync node certs to other etcd members on every run to keep all etcds up to date after adding additional worker nodes to the cluster

* Updated etcd cert check tasks to better detect when new certificates need to be generated

* Move usage of force_etcd_cert_refresh var to gen_certs fact set

* Force etcd cert generation per server if force_etcd_cert_refresh is set to true

* Include gathering of node certs even if k8s-cluster member and in etcd group.

* Removed run_once due to when statement
2021-02-09 01:53:22 -08:00
Robin Elfrink
91fea7c956 Fix unintended SIGPIPEs. (#7214) 2021-01-27 01:07:40 -08:00
Samuel Liu
1a409dc7ae Add download bin tasks (#7131)
* Add downlaod bin tasks

* Add tags never and etcd

* yamllint
2021-01-22 20:41:39 -08:00