fix(remove-node): Ensure safety and validation for node removal process (#12085)

This commit enhances the node removal playbook's reliability and safety by implementing the following changes:

1. **Node Validation**: Added a validation step using assert to ensure the `node` variable is defined and contains nodes. If the list is empty or undefined, the playbook fails early, preventing accidental operations on the entire cluster.

2. **Removed Defaulting for Hosts**: Updated tasks to enforce explicit `node` variable input without defaulting to critical groups (e.g., `etcd:k8s_cluster:calico_rr`). By validating `node` beforehand, tasks now solely rely on user-provided input and safely avoid unintended targeting.

3. **Explicit User Confirmation**: Enhanced the confirmation prompt to clarify the scope of the operation. The admin is now required to explicitly confirm node state deletion, ensuring a deliberate decision before proceeding.

These improvements strengthen the reliability and safety of the `remove-node.yml` playbook by eliminating ambiguous behavior, preventing misconfigurations, and ensuring clear interaction during node removal tasks.
This commit is contained in:
Farshad Asadpour
2025-03-27 16:40:34 +03:30
committed by GitHub
parent 4a5b524b98
commit 1513254622
2 changed files with 15 additions and 3 deletions

View File

@@ -1,9 +1,19 @@
---
- name: Validate nodes for removal
hosts: localhost
tasks:
- name: Assert that nodes are specified for removal
assert:
that:
- node is defined
- node | length > 0
msg: "No nodes specified for removal. The `node` variable must be set explicitly."
- name: Common tasks for every playbooks
import_playbook: boilerplate.yml
- name: Confirm node removal
hosts: "{{ node | default('etcd:k8s_cluster:calico_rr') }}"
hosts: "{{ node | default('this_is_unreachable') }}"
gather_facts: false
tasks:
- name: Confirm Execution
@@ -24,7 +34,7 @@
when: reset_nodes | default(True) | bool
- name: Reset node
hosts: "{{ node | default('kube_node') }}"
hosts: "{{ node | default('this_is_unreachable') }}"
gather_facts: false
environment: "{{ proxy_disable_env }}"
pre_tasks:
@@ -40,7 +50,7 @@
# Currently cannot remove first control plane node or first etcd node
- name: Post node removal
hosts: "{{ node | default('kube_control_plane[1:]:etcd[1:]') }}"
hosts: "{{ node | default('this_is_unreachable') }}"
gather_facts: false
environment: "{{ proxy_disable_env }}"
roles: