modify doc structure and update existing doc-links as preparation for new doc generation script

2025-12-14 13:54:37 +03:00 · 2024-05-15 19:32:51 +02:00
parent 0b464b5239
commit 4dbfd42f1d
82 changed files with 70 additions and 70 deletions
--- a/docs/CNI/calico.md
+++ b/docs/CNI/calico.md
@@ -0,0 +1,434 @@
+# Calico
+
+Check if the calico-node container is running
+
+```ShellSession
+docker ps | grep calico
+```
+
+**calicoctl.sh** is a wrapper script around the `calicoctl` command with access credentials preconfigured; use it to check the status of the network workloads.
+
+* Check the status of Calico nodes
+
+```ShellSession
+calicoctl.sh node status
+```
+
+* Show the configured network subnet for containers
+
+```ShellSession
+calicoctl.sh get ippool -o wide
+```
+
+* Show the workloads (IP addresses of containers and their location)
+
+```ShellSession
+calicoctl.sh get workloadEndpoint -o wide
+```
+
+and
+
+```ShellSession
+calicoctl.sh get hostEndpoint -o wide
+```
+
+## Configuration
+
+### Optional : Define datastore type
+
+The default datastore, the Kubernetes API datastore, is recommended for on-premises deployments and supports only Kubernetes workloads; etcd is the best datastore for hybrid deployments.
+
+Allowed values are `kdd` (default) and `etcd`.
+
+Note: when using `kdd` with more than 50 nodes, consider enabling the `typha` daemon to provide scaling.
+
+To override the default, edit the inventory and add the group variable `calico_datastore`:
+
+```yml
+calico_datastore: kdd
+```
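+
+For the large-cluster case mentioned in the note above, a minimal sketch of enabling Typha might look like the following. The variable names `typha_enabled` and `typha_replicas` are assumptions not stated in this document; check the Calico role defaults in your Kubespray version before relying on them.
+
+```yml
+calico_datastore: kdd
+typha_enabled: true   # assumed variable name; verify in the Calico role defaults
+typha_replicas: 1     # assumed variable name; scale with cluster size
+```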
+
+### Optional : Define network backend
+
+In some cases you may want to define Calico network backend. Allowed values are `bird`, `vxlan` or `none`. `vxlan` is the default value.
+
+To override the default, edit the inventory and add the group variable `calico_network_backend`:
+
+```yml
+calico_network_backend: none
+```
+
+### Optional : Define the default pool CIDRs
+
+By default, `kube_pods_subnet` is used as the IP range CIDR for the default IP Pool, and `kube_pods_subnet_ipv6` for IPv6.
+In some cases you may want to add several pools without Kubernetes treating them as external, which means they must be within or equal to the ranges defined in `kube_pods_subnet` and `kube_pods_subnet_ipv6`. Start with the default IP pools, whose CIDRs can be defined in group_vars (k8s_cluster/k8s-net-calico.yml):
+
+```yml
+calico_pool_cidr: 10.233.64.0/20
+calico_pool_cidr_ipv6: fd85:ee78:d8a6:8607::1:0000/112
+```
+
+### Optional : BGP Peering with border routers
+
+In some cases you may want to route the pods subnet so that NAT is not needed on the nodes.
+For instance, you may have a cluster spread across different locations and want your pods to talk to each other no matter where they are located.
+The following variables need to be set as follows:
+
+```yml
+peer_with_router: true  # enable the peering with the datacenter's border router (default value: false).
+nat_outgoing: false  # (optional) NAT outgoing (default value: true).
+```
+
+You also need to edit the inventory and add a `local_as` hostvar for each node:
+
+```ini
+node1 ansible_ssh_host=95.54.0.12 local_as=xxxxxx
+```
+
+### Optional : Defining BGP peers
+
+Peers can be defined using the `peers` variable (see the examples in docs/calico_peer_example).
+To define global peers, define the `peers` variable in group_vars with the "scope" attribute of each global peer set to "global".
+To define peers on a per-node basis, define the `peers` variable in hostvars.
+NB: Ansible's `hash_behaviour` is set to "replace" by default, so defining both global and per-node peers leaves only the per-node peers. If you need both, define the global peers in hostvars for each host, alongside the per-node peers.
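+
+As an illustration only, a global peer defined in group_vars might look like the sketch below. The field names `router_id` and `as` are assumptions based on the example files referenced above, and the address and AS number are placeholders; verify the exact schema against docs/calico_peer_example.
+
+```yml
+peers:
+  - router_id: "192.0.2.1"   # placeholder address of the border router (assumed field name)
+    as: "65000"              # placeholder AS number (assumed field name)
+    scope: "global"          # marks this peer as global, as described above
+```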
+
+Since Calico 3.4, Calico supports advertising Kubernetes service cluster IPs over BGP, just as it advertises pod IPs.
+This can be enabled by setting the following variable in group_vars (k8s_cluster/k8s-net-calico.yml):
+
+```yml
+calico_advertise_cluster_ips: true
+```
+
+Since Calico 3.10, Calico supports advertising Kubernetes service ExternalIPs over BGP in addition to advertising cluster IPs.
+This can be enabled by setting the following variable in group_vars (k8s_cluster/k8s-net-calico.yml):
+
+```yml
+calico_advertise_service_external_ips:
+- x.x.x.x/24
+- y.y.y.y/32
+```
+
+### Optional : Define global AS number
+
+The optional parameter `global_as_num` defines the Calico global AS number (`/calico/bgp/v1/global/as_num` etcd key).
+It defaults to "64512".
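+
+For example, to override the default, a sketch of the setting could look like this (the value `65400` is simply the one used in the route reflector example later in this document; placing it in group_vars is an assumption):
+
+```yml
+global_as_num: "65400"
+```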
+
+### Optional : BGP Peering with route reflectors
+
+At large scale you may want to disable full node-to-node mesh in order to
+optimize your BGP topology and improve `calico-node` containers' start times.
+
+To do so you can deploy BGP route reflectors and peer `calico-node` with them as
+recommended here:
+
+* <https://hub.docker.com/r/calico/routereflector/>
+* <https://docs.projectcalico.org/v3.1/reference/private-cloud/l3-interconnect-fabric>
+
+You need to edit your inventory and add:
+
+* `calico_rr` group with nodes in it. `calico_rr` can be combined with
+  `kube_node` and/or `kube_control_plane`. `calico_rr` group also must be a child
+   group of `k8s_cluster` group.
+* `cluster_id` by route reflector node/group (see details [here](https://hub.docker.com/r/calico/routereflector/))
+
+Here's an example of Kubespray inventory with standalone route reflectors:
+
+```ini
+[all]
+rr0 ansible_ssh_host=10.210.1.10 ip=10.210.1.10
+rr1 ansible_ssh_host=10.210.1.11 ip=10.210.1.11
+node2 ansible_ssh_host=10.210.1.12 ip=10.210.1.12
+node3 ansible_ssh_host=10.210.1.13 ip=10.210.1.13
+node4 ansible_ssh_host=10.210.1.14 ip=10.210.1.14
+node5 ansible_ssh_host=10.210.1.15 ip=10.210.1.15
+
+[kube_control_plane]
+node2
+node3
+
+[etcd]
+node2
+node3
+node4
+
+[kube_node]
+node2
+node3
+node4
+node5
+
+[k8s_cluster:children]
+kube_node
+kube_control_plane
+calico_rr
+
+[calico_rr]
+rr0
+rr1
+
+[rack0]
+rr0
+rr1
+node2
+node3
+node4
+node5
+
+[rack0:vars]
+cluster_id="1.0.0.1"
+calico_rr_id=rr1
+calico_group_id=rr1
+```
+
+The inventory above will deploy the following topology, assuming that Calico's
+`global_as_num` is set to `65400`:
+
+![Image](figures/kubespray-calico-rr.png?raw=true)
+
+### Optional : Define default endpoint to host action
+
+By default Calico blocks traffic from endpoints to the host itself by using an iptables DROP action. When using it in Kubernetes, the action has to be changed to RETURN (default in Kubespray) or ACCEPT (see <https://docs.tigera.io/calico/latest/network-policy/hosts/protect-hosts#control-default-behavior-of-workload-endpoint-to-host-traffic>). Otherwise, all network packets from pods (with hostNetwork=False) to service endpoints (with hostNetwork=True) within the same node are dropped.
+
+To re-define default action please set the following variable in your inventory:
+
+```yml
+calico_endpoint_to_host_action: "ACCEPT"
+```
+
+### Optional : Define address on which Felix will respond to health requests
+
+Since Calico 3.2.0, HealthCheck default behavior changed from listening on all interfaces to just listening on localhost.
+
+To re-define health host please set the following variable in your inventory:
+
+```yml
+calico_healthhost: "0.0.0.0"
+```
+
+### Optional : Configure VXLAN hardware Offload
+
+VXLAN offload is disabled by default. To enable it, configure the following:
+
+```yml
+calico_feature_detect_override: "ChecksumOffloadBroken=false" # Enables VXLAN offload (may cause problems with buggy NIC drivers)
+```
+
+### Optional : Configure Calico Node probe timeouts
+
+Under certain conditions a deployer may need to tune the Calico liveness and readiness probes timeout settings. These can be configured like this:
+
+```yml
+calico_node_livenessprobe_timeout: 10
+calico_node_readinessprobe_timeout: 10
+```
+
+### Optional : Enable NAT with IPv6
+
+To allow outgoing IPv6 traffic going from pods to the Internet, enable the following:
+
+```yml
+nat_outgoing_ipv6: true  # NAT outgoing ipv6 (default value: false).
+```
+
+## Config encapsulation for cross server traffic
+
+Calico supports two types of encapsulation: [VXLAN and IP in IP](https://docs.projectcalico.org/v3.11/networking/vxlan-ipip). VXLAN is the more mature implementation and is enabled by default. Check your environment to determine whether you need *IP in IP* encapsulation.
+
+*IP in IP* and *VXLAN* are mutually exclusive modes.
+
+Kubespray defaults have changed after version 2.18 from auto-enabling `ipip` mode to auto-enabling `vxlan`. This was done to facilitate wider deployment scenarios including those where vxlan acceleration is provided by the underlying network devices.
+
+If you are running your cluster with the default calico settings and are upgrading to a release post 2.18.x (i.e. 2.19 and later or `master` branch) then you have two options:
+
+* perform a manual migration to vxlan before upgrading kubespray (see migrating from IP in IP to VXLAN below)
+* pin the pre-2.19 settings in your ansible inventory (see IP in IP mode settings below)
+
+**Note:** VXLAN over IPv6 is only supported with kernel >= 3.12. If your kernel version is older than 3.12, do not set `calico_vxlan_mode_ipv6: Always`. For more details, see [issue #6877](https://github.com/projectcalico/calico/issues/6877).
+
+### IP in IP mode
+
+To configure IP in IP mode you need to use the bird network backend.
+
+```yml
+calico_ipip_mode: 'Always'  # Possible values are `Always`, `CrossSubnet`, `Never`
+calico_vxlan_mode: 'Never'
+calico_network_backend: 'bird'
+```
+
+### BGP mode
+
+To enable BGP no-encapsulation mode:
+
+```yml
+calico_ipip_mode: 'Never'
+calico_vxlan_mode: 'Never'
+calico_network_backend: 'bird'
+```
+
+### Migrating from IP in IP to VXLAN
+
+If you would like to migrate from the old default (IP in IP with the `bird` network backend) to the new VXLAN-based encapsulation, you need to perform this change before running an upgrade of your cluster; the `cluster.yml` and `upgrade-cluster.yml` playbooks will refuse to continue if they detect incompatible settings.
+
+Execute the following steps on one of the control plane nodes. Ensure the cluster is healthy before proceeding.
+
+```shell
+calicoctl.sh patch felixconfig default -p '{"spec":{"vxlanEnabled":true}}'
+calicoctl.sh patch ippool default-pool -p '{"spec":{"ipipMode":"Never", "vxlanMode":"Always"}}'
+```
+
+**Note:** if you created multiple ippools you will need to patch all of them individually to change their encapsulation. The kubespray playbooks only handle the default ippool created by kubespray.
+
+Wait for the `vxlan.calico` interfaces to be created on all cluster nodes and for traffic to be routed through them; then you can disable `ipip`.
+
+```shell
+calicoctl.sh patch felixconfig default -p '{"spec":{"ipipEnabled":false}}'
+```
+
+## Configuring interface MTU
+
+This is an advanced topic and should usually not be modified unless you know exactly what you are doing. Calico is smart enough to deal with the defaults and calculate the proper MTU. If you do need to set up a custom MTU you can change `calico_veth_mtu` as follows:
+
+* If Wireguard is enabled, subtract 60 from your network MTU (e.g. 1500-60=1440)
+* If VXLAN or BPF mode is enabled, subtract 50 from your network MTU (e.g. 1500-50=1450)
+* If using IPIP, subtract 20 from your network MTU (e.g. 1500-20=1480)
+* If not using any encapsulation, set it to your network MTU (e.g. 1500 or 9000)
+
+```yaml
+calico_veth_mtu: 1440
+```
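+
+As a further worked example of the rules above, assume a hypothetical network with jumbo frames (MTU 9000) and VXLAN encapsulation. The 50-byte VXLAN overhead from the list gives 9000-50=8950:
+
+```yml
+# Hypothetical jumbo-frame network: 9000 (network MTU) - 50 (VXLAN overhead) = 8950
+calico_veth_mtu: 8950
+```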
+
+## Cloud providers configuration
+
+Please refer to the official documentation. For example, the [GCE configuration](http://docs.projectcalico.org/v1.5/getting-started/docker/installation/gce) requires a security rule for Calico IP-in-IP tunnels. Note that Calico is always configured with ``calico_ipip_mode: Always`` if a cloud provider is defined.
+
+### Optional : Ignore kernel's RPF check setting
+
+By default the Felix agent (calico-node) will abort if the kernel RPF setting is not 'strict'. If you want Calico to ignore the kernel setting:
+
+```yml
+calico_node_ignorelooserpf: true
+```
+
+Note that in OpenStack you must allow `ipip` traffic in your security groups,
+otherwise you will experience timeouts.
+To do this you must add a rule which allows it, for example:
+
+```ShellSession
+neutron  security-group-rule-create  --protocol 4  --direction egress  k8s-a0tp4t
+neutron  security-group-rule-create  --protocol 4  --direction ingress  k8s-a0tp4t
+```
+
+### Optional : Felix configuration via extraenvs of calico node
+
+Possible environment variable parameters for [configuring Felix](https://docs.projectcalico.org/reference/felix/configuration)
+
+```yml
+calico_node_extra_envs:
+    FELIX_DEVICEROUTESOURCEADDRESS: 172.17.0.1
+```
+
+### Optional : Use Calico CNI host-local IPAM plugin
+
+Calico currently supports two types of CNI IPAM plugins, `host-local` and `calico-ipam` (default).
+
+To allow Calico to determine the subnet to use from the Kubernetes API based on the `Node.podCIDR` field, enable the following setting.
+
+```yml
+calico_ipam_host_local: true
+```
+
+Refer to Project Calico section [Using host-local IPAM](https://docs.projectcalico.org/reference/cni-plugin/configuration#using-host-local-ipam) for further information.
+
+### Optional : Disable CNI logging to disk
+
+Calico CNI plugin logs to /var/log/calico/cni/cni.log and to stderr.
+stderr of CNI plugins can be found in the logs of container runtime.
+
+You can disable Calico CNI logging to disk by setting `calico_cni_log_file_path: false`.
+
+## eBPF Support
+
+Calico supports eBPF for its data plane. See [an introduction to the Calico eBPF Dataplane](https://www.projectcalico.org/introducing-the-calico-ebpf-dataplane/) for further information.
+
+Note that it is advisable to always use the latest version of Calico when using the eBPF dataplane.
+
+### Enabling eBPF support
+
+To enable the eBPF dataplane support ensure you add the following to your inventory. Note that the `kube-proxy` is incompatible with running Calico in eBPF mode and the kube-proxy should be removed from the system.
+
+```yaml
+calico_bpf_enabled: true
+```
+
+**NOTE:** there is a known incompatibility with the `kernel-kvm` kernel package on Ubuntu because it is missing support for `CONFIG_NET_SCHED`, which is a requirement for Calico eBPF support. When using Calico eBPF with Ubuntu, ensure you run the `-generic` kernel.
+
+### Cleaning up after kube-proxy
+
+Calico node cannot clean up after kube-proxy has run in ipvs mode. If you are converting an existing cluster to eBPF you will need to ensure the `kube-proxy` DaemonSet is deleted and that ipvs rules are cleaned.
+
+To check that kube-proxy was running in ipvs mode:
+
+```ShellSession
+# ipvsadm -l
+```
+
+To clean up any ipvs leftovers:
+
+```ShellSession
+# ipvsadm -C
+```
+
+### Calico access to the kube-api
+
+Calico node, typha and kube-controllers need to be able to talk to the Kubernetes API. Please refer to the [Enabling eBPF Calico Docs](https://docs.projectcalico.org/maintenance/ebpf/enabling-bpf) for guidelines on how to do this.
+
+Kubespray sets up the `kubernetes-services-endpoint` configmap based on the contents of the `loadbalancer_apiserver` inventory variable documented in [HA Mode](/docs/operations/ha-mode.md).
+
+If no external loadbalancer is used, Calico eBPF can also use the localhost loadbalancer option. This is only possible if the localhost apiserver loadbalancer and the kube-apiserver use the same port. In this case Calico Automatic Host Endpoints need to be enabled to allow services like `coredns` and `metrics-server` to communicate with the Kubernetes host endpoint. See [this blog post](https://www.projectcalico.org/securing-kubernetes-nodes-with-calico-automatic-host-endpoints/) on enabling automatic host endpoints.
+
+### Tunneled versus Direct Server Return
+
+By default Calico uses Tunneled service mode but it can use direct server return (DSR) in order to optimize the return path for a service.
+
+To configure DSR:
+
+```yaml
+calico_bpf_service_mode: "DSR"
+```
+
+### eBPF Logging and Troubleshooting
+
+In order to enable Calico eBPF mode logging:
+
+```yaml
+calico_bpf_log_level: "Debug"
+```
+
+To view the logs you need to use the `tc` command to read the kernel trace buffer:
+
+```ShellSession
+tc exec bpf debug
+```
+
+Please see [Calico eBPF troubleshooting guide](https://docs.projectcalico.org/maintenance/troubleshoot/troubleshoot-ebpf#ebpf-program-debug-logs).
+
+## Wireguard Encryption
+
+Calico supports using Wireguard for encryption. Please see the docs on [encrypt cluster pod traffic](https://docs.projectcalico.org/security/encrypt-cluster-pod-traffic).
+
+To enable wireguard support:
+
+```yaml
+calico_wireguard_enabled: true
+```
+
+The following OSes will require enabling the EPEL repo in order to bring in wireguard tools:
+
+* CentOS 7 & 8
+* AlmaLinux 8
+* Rocky Linux 8
+* Amazon Linux 2
+
+```yaml
+epel_enabled: true
+```
--- a/docs/CNI/cilium.md
+++ b/docs/CNI/cilium.md
@@ -0,0 +1,248 @@
+# Cilium
+
+## IP Address Management (IPAM)
+
+IP Address Management (IPAM) is responsible for the allocation and management of IP addresses used by network endpoints (container and others) managed by Cilium. The default mode is "Cluster Scope".
+
+You can set the mode with the following parameter; possible values include `cluster-pool` and `kubernetes`:
+
+```yml
+cilium_ipam_mode: cluster-pool
+```
+
+### Set the cluster Pod CIDRs
+
+Cluster Pod CIDRs use the `kube_pods_subnet` value by default.
+If your node network is in the same range you will lose connectivity to other nodes.
+To use a different range, set the following parameter:
+
+```yml
+cilium_pool_cidr: 10.233.64.0/18
+```
+
+When `cilium_enable_ipv6` is used, you also need to set the IPv6 value.
+It defaults to `kube_pods_subnet_ipv6` if not set:
+
+```yml
+cilium_pool_cidr_ipv6: fd85:ee78:d8a6:8607::1:0000/112
+```
+
+### Set the Pod CIDR size of a node
+
+When Cilium IPAM uses the "Cluster Scope" mode, it pre-allocates a segment of IPs to each node;
+Pods scheduled to that node are then allocated IPs from this segment. `cilium_pool_mask_size` specifies
+the size allocated from the cluster Pod CIDR to node.ipam.podCIDRs.
+Defaults to `kube_network_node_prefix` if not set.
+
+```yml
+cilium_pool_mask_size: "24"
+```
+
+`cilium_pool_mask_size_ipv6` specifies the size allocated to node.ipam.podCIDRs from the cluster Pod IPv6 CIDR. Defaults to `kube_network_node_prefix_ipv6` if not set.
+
+```yml
+cilium_pool_mask_size_ipv6: "120"
+```
+
+## Kube-proxy replacement with Cilium
+
+Cilium can run without kube-proxy by setting `cilium_kube_proxy_replacement`
+to `strict`.
+
+Without kube-proxy, cilium needs to know the address of the kube-apiserver
+and this must be set globally for all Cilium components (agents and operators).
+We can only use the localhost apiserver loadbalancer in this mode
+whenever it uses the same port as the kube-apiserver (by default it does).
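+
+The setting described above can be sketched as the following inventory fragment (placing it in the Cilium group_vars file is an assumption):
+
+```yml
+cilium_kube_proxy_replacement: strict
+```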
+
+## Cilium Operator
+
+Unlike some operators, Cilium Operator does not exist for installation purposes.
+> The Cilium Operator is responsible for managing duties in the cluster which should logically be handled once for the entire cluster, rather than once for each node in the cluster.
+
+### Adding custom flags to the Cilium Operator
+
+You can set additional cilium-operator container arguments using `cilium_operator_custom_args`.
+This is an advanced option, and you should only use it if you know what you are doing.
+
+Accepts an array or a string.
+
+```yml
+cilium_operator_custom_args: ["--foo=bar", "--baz=qux"]
+```
+
+or
+
+```yml
+cilium_operator_custom_args: "--foo=bar"
+```
+
+You do not need to add a custom flag to enable debugging. Instead, feel free to use the `CILIUM_DEBUG` variable.
+
+### Adding extra volumes and mounting them
+
+You can use `cilium_operator_extra_volumes` to add extra volumes to the Cilium Operator, and use `cilium_operator_extra_volume_mounts` to mount those volumes.
+This is an advanced option, and you should only use it if you know what you are doing.
+
+```yml
+cilium_operator_extra_volumes:
+  - configMap:
+      name: foo
+    name: foo-mount-path
+
+cilium_operator_extra_volume_mounts:
+  - mountPath: /tmp/foo/bar
+    name: foo-mount-path
+    readOnly: true
+```
+
+## Choose Cilium version
+
+```yml
+cilium_version: v1.12.1
+```
+
+## Add variable to config
+
+Use the `cilium_config_extra_vars` variable to add extra entries to the Cilium configuration.
+
+Example:
+
+```yml
+cilium_config_extra_vars:
+  enable-endpoint-routes: true
+```
+
+## Change Identity Allocation Mode
+
+Cilium assigns an identity for each endpoint. This identity is used to enforce basic connectivity between endpoints.
+
+Cilium currently supports two different identity allocation modes:
+
+- "crd" stores identities in kubernetes as CRDs (custom resource definition).
+  - These can be queried with `kubectl get ciliumid`
+- "kvstore" stores identities in an etcd kvstore.
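+
+A minimal sketch of selecting a mode is shown below. The variable name `cilium_identity_allocation_mode` is an assumption not stated in this document; confirm it in the Cilium role defaults of your Kubespray version.
+
+```yml
+cilium_identity_allocation_mode: crd   # assumed variable name; "kvstore" is the other mode
+```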
+
+## Enable Transparent Encryption
+
+Cilium supports the transparent encryption of Cilium-managed host traffic and
+traffic between Cilium-managed endpoints either using IPsec or Wireguard.
+
+Wireguard option is only available in Cilium 1.10.0 and newer.
+
+### IPsec Encryption
+
+For further information, make sure to check the official [Cilium documentation.](https://docs.cilium.io/en/stable/gettingstarted/encryption-ipsec/)
+
+To enable IPsec encryption, you just need to set three variables.
+
+```yml
+cilium_encryption_enabled: true
+cilium_encryption_type: "ipsec"
+```
+
+The third variable is `cilium_ipsec_key`. You need to create a secret key string for this variable.
+Kubespray does not automate this process.
+Cilium documentation currently recommends creating a key using the following command:
+
+```shell
+echo "3 rfc4106(gcm(aes)) $(echo $(dd if=/dev/urandom count=20 bs=1 2> /dev/null | xxd -p -c 64)) 128"
+```
+
+Note that Kubespray handles secret creation. So you only need to pass the key as the `cilium_ipsec_key` variable, base64 encoded:
+
+```shell
+echo "cilium_ipsec_key: "$(echo -n "3 rfc4106(gcm(aes)) $(echo $(dd if=/dev/urandom count=20 bs=1 2> /dev/null | xxd -p -c 64)) 128" | base64 -w0)
+```
+
+### Wireguard Encryption
+
+For further information, make sure to check the official [Cilium documentation.](https://docs.cilium.io/en/stable/gettingstarted/encryption-wireguard/)
+
+To enable Wireguard encryption, you just need to set two variables.
+
+```yml
+cilium_encryption_enabled: true
+cilium_encryption_type: "wireguard"
+```
+
+Kubespray currently supports Linux distributions with Wireguard Kernel mode on Linux 5.6 and newer.
+
+## Bandwidth Manager
+
+Cilium’s bandwidth manager supports the kubernetes.io/egress-bandwidth Pod annotation.
+
+Bandwidth enforcement currently does not work in combination with L7 Cilium Network Policies.
+In case they select the Pod at egress, then the bandwidth enforcement will be disabled for those Pods.
+
+Bandwidth Manager requires a v5.1.x or more recent Linux kernel.
+
+For further information, make sure to check the official [Cilium documentation.](https://docs.cilium.io/en/v1.12/gettingstarted/bandwidth-manager/)
+
+To use this feature, set the following parameter:
+
+```yml
+cilium_enable_bandwidth_manager: true
+```
+
+## Install Cilium Hubble
+
+k8s-net-cilium.yml:
+
+```yml
+cilium_enable_hubble: true ## enable support hubble in cilium
+cilium_hubble_install: true ## install hubble-relay, hubble-ui
+cilium_hubble_tls_generate: true ## install hubble-certgen and generate certificates
+```
+
+To validate that Hubble UI is properly configured, set up a port forwarding for hubble-ui service:
+
+```shell
+kubectl port-forward -n kube-system svc/hubble-ui 12000:80
+```
+
+and then open [http://localhost:12000/](http://localhost:12000/).
+
+## Hubble metrics
+
+```yml
+cilium_enable_hubble_metrics: true
+cilium_hubble_metrics:
+  - dns
+  - drop
+  - tcp
+  - flow
+  - icmp
+  - http
+```
+
+[More](https://docs.cilium.io/en/v1.9/operations/metrics/#hubble-exported-metrics)
+
+## Upgrade considerations
+
+### Rolling-restart timeouts
+
+Cilium relies on the kernel's BPF support, which is extremely fast at runtime but incurs a compilation penalty on initialization and update.
+
+As a result, the Cilium DaemonSet pods can take a significant time to start, which scales with the number of nodes and endpoints in your cluster.
+
+As part of cluster.yml, this DaemonSet is restarted, and Kubespray's [default timeouts for this operation](../roles/network_plugin/cilium/defaults/main.yml)
+are not appropriate for large clusters.
+
+This means that you will likely want to update these timeouts to a value more in-line with your cluster's number of nodes and their respective CPU performance.
+This is configured by the following values:
+
+```yaml
+# Configure how long to wait for the Cilium DaemonSet to be ready again
+cilium_rolling_restart_wait_retries_count: 30
+cilium_rolling_restart_wait_retries_delay_seconds: 10
+```
+
+The total time allowed (count * delay) should be at least `($number_of_nodes_in_cluster * $cilium_pod_start_time)` for successful rolling updates. There are no
+drawbacks to making it higher and giving yourself a time buffer to accommodate transient slowdowns.
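+
+As a worked example of the formula above, assume a hypothetical cluster of 100 nodes where a Cilium pod takes about 20 seconds to become ready. The minimum total wait is 100 * 20 = 2000 seconds; with a 10-second delay that is 200 retries, and the sketch below uses 250 to leave a buffer:
+
+```yaml
+# Hypothetical cluster: 100 nodes, ~20 s per Cilium pod start
+# Minimum total wait: 100 * 20 = 2000 s -> 2000 / 10 = 200 retries
+# 250 retries * 10 s = 2500 s leaves a 25% buffer
+cilium_rolling_restart_wait_retries_count: 250
+cilium_rolling_restart_wait_retries_delay_seconds: 10
+```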
+
+Note: To find the `$cilium_pod_start_time` for your cluster, you can simply restart a Cilium pod on a node of your choice and look at how long it takes for it
+to become ready.
+
+Note 2: The default CPU requests/limits for Cilium pods are set to a very conservative 100m:500m, which will likely yield very slow startup for Cilium pods. You
+probably want to significantly increase the CPU limit, specifically if short bursts of CPU from Cilium are acceptable to you.
--- a/docs/CNI/cni.md
+++ b/docs/CNI/cni.md
@@ -0,0 +1,8 @@
+CNI
+==============
+
+This network plugin only unpacks CNI plugins version `cni_version` into `/opt/cni/bin` and instructs the container runtime's CRI implementation to use CNI.
+
+Its intended usage is for custom CNI configuration, e.g. manual routing tables + bridge + loopback CNI plugin outside Kubespray's scope. It is also used for CNI plugins not supported by Kubespray, which you can install afterward.
+
+You are required to fill `/etc/cni/net.d` with valid CNI configuration after using kubespray.
--- a/docs/CNI/flannel.md
+++ b/docs/CNI/flannel.md
@@ -0,0 +1,51 @@
+# Flannel
+
+Flannel is a network fabric for containers, designed for Kubernetes.
+
+Supported [backends](https://github.com/flannel-io/flannel/blob/master/Documentation/backends.md#wireguard): `vxlan`, `host-gw` and `wireguard`
+
+**Warning:** You may encounter this [bug](https://github.com/coreos/flannel/pull/1282) with the `VXLAN` backend. While waiting on a newer Flannel version, the current workaround (`ethtool --offload flannel.1 rx off tx off`) is showcased in the Kubespray [networking test](tests/testcases/040_check-network-adv.yml:31).
+
+## Verifying flannel install
+
+* Check that the Flannel configuration file has been created
+
+```ShellSession
+cat /run/flannel/subnet.env
+FLANNEL_NETWORK=10.233.0.0/18
+FLANNEL_SUBNET=10.233.16.1/24
+FLANNEL_MTU=1450
+FLANNEL_IPMASQ=false
+```
+
+* Check if the network interface has been created
+
+```ShellSession
+ip a show dev flannel.1
+4: flannel.1: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1450 qdisc noqueue state UNKNOWN group default
+    link/ether e2:f3:a7:0f:bf:cb brd ff:ff:ff:ff:ff:ff
+    inet 10.233.16.0/18 scope global flannel.1
+       valid_lft forever preferred_lft forever
+    inet6 fe80::e0f3:a7ff:fe0f:bfcb/64 scope link
+       valid_lft forever preferred_lft forever
+```
+
+* Try to run a container and check its IP address
+
+```ShellSession
+kubectl run test --image=busybox --command -- tail -f /dev/null
+replicationcontroller "test" created
+
+kubectl describe po test-34ozs | grep ^IP
+IP:                             10.233.16.2
+```
+
+```ShellSession
+kubectl exec test-34ozs -- ip a show dev eth0
+8: eth0@if9: <BROADCAST,MULTICAST,UP,LOWER_UP,M-DOWN> mtu 1450 qdisc noqueue
+    link/ether 02:42:0a:e9:2b:03 brd ff:ff:ff:ff:ff:ff
+    inet 10.233.16.2/24 scope global eth0
+       valid_lft forever preferred_lft forever
+    inet6 fe80::42:aff:fee9:2b03/64 scope link tentative flags 08
+       valid_lft forever preferred_lft forever
+```
--- a/docs/CNI/kube-ovn.md
+++ b/docs/CNI/kube-ovn.md
@@ -0,0 +1,55 @@
+# Kube-OVN
+
+Kube-OVN integrates the OVN-based Network Virtualization with Kubernetes. It offers an advanced Container Network Fabric for Enterprises.
+
+For more information please check [Kube-OVN documentation](https://github.com/alauda/kube-ovn)
+
+**Warning:** Kernel version (`cat /proc/version`) needs to be different from `3.10.0-862` or kube-ovn won't start and will print this message:
+
+```bash
+kernel version 3.10.0-862 has a nat related bug that will affect ovs function, please update to a version greater than 3.10.0-898
+```
+
+## How to use it
+
+Enable kube-ovn in `group_vars/k8s_cluster/k8s_cluster.yml`
+
+```yml
+...
+kube_network_plugin: kube-ovn
+...
+```
+
+## Verifying kube-ovn install
+
+Kube-OVN runs OVN and its controller in the `kube-ovn` namespace
+
+* Check the status of kube-ovn pods
+
+```ShellSession
+# From the CLI
+kubectl get pod -n kube-ovn
+
+# Output
+NAME                                   READY   STATUS    RESTARTS   AGE
+kube-ovn-cni-49lsm                     1/1     Running   0          2d20h
+kube-ovn-cni-9db8f                     1/1     Running   0          2d20h
+kube-ovn-cni-wftdk                     1/1     Running   0          2d20h
+kube-ovn-controller-68d7bb48bd-7tnvg   1/1     Running   0          2d21h
+ovn-central-6675dbb7d9-d7z8m           1/1     Running   0          4d16h
+ovs-ovn-hqn8p                          1/1     Running   0          4d16h
+ovs-ovn-hvpl8                          1/1     Running   0          4d16h
+ovs-ovn-r5frh                          1/1     Running   0          4d16h
+```
+
+* Check the default and node subnets
+
+```ShellSession
+# From the CLI
+kubectl get subnet
+
+# Output
+NAME          PROTOCOL   CIDR            PRIVATE   NAT
+join          IPv4       100.64.0.0/16   false     false
+ovn-default   IPv4       10.16.0.0/16    false     true
+```
--- a/docs/CNI/kube-router.md
+++ b/docs/CNI/kube-router.md
@@ -0,0 +1,79 @@
+# Kube-router
+
+Kube-router is an L3 CNI provider; as such, it sets up IPv4 routing between
+nodes to make the Pods' networks reachable.
+
+See [kube-router documentation](https://www.kube-router.io/).
+
+## Verifying kube-router install
+
+Kube-router runs its pods as a `DaemonSet` in the `kube-system` namespace:
+
+* Check the status of kube-router pods
+
+```ShellSession
+# From the CLI
+kubectl get pod --namespace=kube-system -l k8s-app=kube-router -owide
+
+# output
+NAME                READY     STATUS    RESTARTS   AGE       IP               NODE                   NOMINATED NODE
+kube-router-4f679   1/1       Running   0          2d        192.168.186.4    mykube-k8s-node-nf-2   <none>
+kube-router-5slf8   1/1       Running   0          2d        192.168.186.11   mykube-k8s-node-nf-3   <none>
+kube-router-lb6k2   1/1       Running   0          20h       192.168.186.14   mykube-k8s-node-nf-6   <none>
+kube-router-rzvrb   1/1       Running   0          20h       192.168.186.17   mykube-k8s-node-nf-4   <none>
+kube-router-v6n56   1/1       Running   0          2d        192.168.186.6    mykube-k8s-node-nf-1   <none>
+kube-router-wwhg8   1/1       Running   0          20h       192.168.186.16   mykube-k8s-node-nf-5   <none>
+kube-router-x2xs7   1/1       Running   0          2d        192.168.186.10   mykube-k8s-master-1    <none>
+```
+
+* Peek at kube-router container logs:
+
+```ShellSession
+# From the CLI
+kubectl logs --namespace=kube-system -l k8s-app=kube-router | grep Peer.Up
+
+# output
+time="2018-09-17T16:47:14Z" level=info msg="Peer Up" Key=192.168.186.6 State=BGP_FSM_OPENCONFIRM Topic=Peer
+time="2018-09-17T16:47:16Z" level=info msg="Peer Up" Key=192.168.186.11 State=BGP_FSM_OPENCONFIRM Topic=Peer
+time="2018-09-17T16:47:46Z" level=info msg="Peer Up" Key=192.168.186.10 State=BGP_FSM_OPENCONFIRM Topic=Peer
+time="2018-09-18T19:12:24Z" level=info msg="Peer Up" Key=192.168.186.14 State=BGP_FSM_OPENCONFIRM Topic=Peer
+time="2018-09-18T19:12:28Z" level=info msg="Peer Up" Key=192.168.186.17 State=BGP_FSM_OPENCONFIRM Topic=Peer
+time="2018-09-18T19:12:38Z" level=info msg="Peer Up" Key=192.168.186.16 State=BGP_FSM_OPENCONFIRM Topic=Peer
+[...]
+```
+
+## Gathering kube-router state
+
+Kube-router Pods come bundled with a "Pod Toolbox" which provides very
+useful internal state views for:
+
+* IPVS: via `ipvsadm`
+* BGP peering and routing info: via `gobgp`
+
+To use these tools, `kubectl exec -it ...` into a kube-router container. See
+<https://www.kube-router.io/docs/pod-toolbox/> for details.
+
+## Kube-router configuration
+
+You can change the default configuration by overriding the `kube_router_...` variables
+(as found at `roles/network_plugin/kube-router/defaults/main.yml`).
+They are named after the `kube-router` command-line options described at
+<https://www.kube-router.io/docs/user-guide/#try-kube-router-with-cluster-installers>.
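+
+As a minimal sketch of such an override, the snippet below could be placed in your inventory's group variables. The variable names shown are assumptions based on the `kube_router_...` naming pattern; check them against `roles/network_plugin/kube-router/defaults/main.yml` before use.
+
+```yml
+# Assumed variable names: verify each one in the role defaults before relying on it
+kube_router_run_service_proxy: true      # assumed to map to kube-router's --run-service-proxy
+kube_router_advertise_cluster_ip: true   # assumed to map to kube-router's --advertise-cluster-ip
+```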
+
+## Advanced BGP Capabilities
+
+<https://github.com/cloudnativelabs/kube-router#advanced-bgp-capabilities>
+
+If you have other networking devices or SDN systems that talk BGP, kube-router will fit in perfectly.
+From a simple full node-to-node mesh to per-node peering configurations, most routing needs can be attained.
+The configuration is Kubernetes native (annotations) just like the rest of kube-router.
+
+For more details, please refer to <https://github.com/cloudnativelabs/kube-router/blob/master/docs/bgp.md>.
+
+The following options set up annotations for kube-router using the `kubectl annotate` command.
+
+```yml
+kube_router_annotations_master: []
+kube_router_annotations_node: []
+kube_router_annotations_all: []
+```
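+
+The lists above are empty by default. A hedged illustration of filling them in follows. It assumes each entry is a `key=value` string passed to `kubectl annotate`, and the ASN and peer address are hypothetical values, not recommendations.
+
+```yml
+# Hypothetical values; each entry is assumed to be passed as-is to `kubectl annotate`
+kube_router_annotations_node:
+  - kube-router.io/node.asn=64512        # per-node ASN (example value)
+kube_router_annotations_all:
+  - kube-router.io/peer.ips=192.168.1.99 # external BGP peer (example value)
+  - kube-router.io/peer.asns=65000       # ASN of that peer (example value)
+```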
--- a/docs/CNI/macvlan.md
+++ b/docs/CNI/macvlan.md
@@ -0,0 +1,41 @@
+# Macvlan
+
+## How to use it
+
+* Enable macvlan in `group_vars/k8s_cluster/k8s_cluster.yml`
+
+```yml
+...
+kube_network_plugin: macvlan
+...
+```
+
+* Adjust `macvlan_interface` in `group_vars/k8s_cluster/k8s-net-macvlan.yml`, or per host in the `host.yml` file:
+
+```yml
+all:
+  hosts:
+    node1:
+      ip: 10.2.2.1
+      access_ip: 10.2.2.1
+      ansible_host: 10.2.2.1
+      macvlan_interface: ens5
+```
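+
+The group-wide alternative might look like the sketch below. It assumes every node exposes the same interface name; `ens5` is simply reused from the per-host example.
+
+```yml
+# group_vars/k8s_cluster/k8s-net-macvlan.yml
+# Assumes all nodes share this interface name; otherwise set it per host
+macvlan_interface: ens5
+```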
+
+## Issue encountered
+
+* Service DNS
+
+If DNS lookups fail with `reply from unexpected source`,
+
+add `kube_proxy_masquerade_all: true` in `group_vars/all/all.yml`.
+
+* Disable nodelocaldns
+
+The nodelocaldns IP is not reachable.
+
+Disable it in `sample/group_vars/k8s_cluster/k8s_cluster.yml`:
+
+```yml
+enable_nodelocaldns: false
+```
--- a/docs/CNI/multus.md
+++ b/docs/CNI/multus.md
@@ -0,0 +1,74 @@
+# Multus
+
+Multus is a meta CNI plugin that provides multiple network interface support to
+pods. For each interface, Multus delegates CNI calls to secondary CNI plugins
+such as Calico, macvlan, etc.
+
+See [multus documentation](https://github.com/k8snetworkplumbingwg/multus-cni).
+
+## Multus installation
+
+Since Multus itself does not implement networking, it requires a master plugin, which is specified through the variable `kube_network_plugin`. To enable Multus an additional variable `kube_network_plugin_multus` must be set to `true`. For example,
+
+```yml
+kube_network_plugin: calico
+kube_network_plugin_multus: true
+```
+
+will install Multus and Calico and configure Multus to use Calico as the primary network plugin.
+
+## Using Multus
+
+Once Multus is installed, you can create CNI configurations (as CRD objects) for additional networks. In this example, a macvlan CNI configuration is defined. You may replace the `config` field with any valid CNI configuration whose CNI binary is available on the nodes.
+
+```ShellSession
+cat <<EOF | kubectl create -f -
+apiVersion: "k8s.cni.cncf.io/v1"
+kind: NetworkAttachmentDefinition
+metadata:
+  name: macvlan-conf
+spec:
+  config: '{
+      "cniVersion": "0.4.0",
+      "type": "macvlan",
+      "master": "eth0",
+      "mode": "bridge",
+      "ipam": {
+        "type": "host-local",
+        "subnet": "192.168.1.0/24",
+        "rangeStart": "192.168.1.200",
+        "rangeEnd": "192.168.1.216",
+        "routes": [
+          { "dst": "0.0.0.0/0" }
+        ],
+        "gateway": "192.168.1.1"
+      }
+    }'
+EOF
+```
+
+You may then create a pod with an additional interface that connects to this network using annotations. The annotation value corresponds to the name of the NetworkAttachmentDefinition above.
+
+```ShellSession
+cat <<EOF | kubectl create -f -
+apiVersion: v1
+kind: Pod
+metadata:
+  name: samplepod
+  annotations:
+    k8s.v1.cni.cncf.io/networks: macvlan-conf
+spec:
+  containers:
+  - name: samplepod
+    command: ["/bin/bash", "-c", "sleep 2000000000000"]
+    image: dougbtv/centos-network
+EOF
+```
+
+You may now inspect the pod and see that there is an additional interface configured:
+
+```ShellSession
+kubectl exec -it samplepod -- ip a
+```
+
+For more details on how to use Multus, please visit <https://github.com/k8snetworkplumbingwg/multus-cni>
--- a/docs/CNI/weave.md
+++ b/docs/CNI/weave.md
@@ -0,0 +1,79 @@
+# Weave
+
+Weave 2.0.1 is supported by kubespray.
+
+Weave supports [**consensus**](https://www.weave.works/docs/net/latest/ipam/##consensus) mode (the default) and [**seed**](https://www.weave.works/docs/net/latest/ipam/#seed) mode.
+
+`Consensus` mode is best suited to clusters of static size, and `seed` mode to clusters of dynamic size.
+
+Weave can encrypt all communication.
+
+* To use Weave encryption, specify a strong password (without a password, there is no encryption)
+
+```ShellSession
+# In file ./inventory/sample/group_vars/k8s_cluster.yml
+weave_password: EnterPasswordHere
+```
+
+This password is used to set an environment variable inside the Weave container.
+
+Weave is deployed by kubespray using a DaemonSet.
+
+* Check the status of Weave containers
+
+```ShellSession
+# From client
+kubectl -n kube-system get pods | grep weave
+# output
+weave-net-50wd2                       2/2       Running   0          2m
+weave-net-js9rb                       2/2       Running   0          2m
+```
+
+There must be as many pods as nodes (here the Kubernetes cluster has 2 nodes, so there are 2 Weave pods).
+
+* Check the status of Weave (connection, encryption, ...) on each node
+
+```ShellSession
+# On nodes
+curl http://127.0.0.1:6784/status
+# output on node1
+Version: 2.0.1 (up to date; next check at 2017/08/01 13:51:34)
+
+        Service: router
+       Protocol: weave 1..2
+           Name: fa:16:3e:b3:d6:b2(node1)
+     Encryption: enabled
+  PeerDiscovery: enabled
+        Targets: 2
+    Connections: 2 (1 established, 1 failed)
+          Peers: 2 (with 2 established connections)
+ TrustedSubnets: none
+
+        Service: ipam
+         Status: ready
+          Range: 10.233.64.0/18
+  DefaultSubnet: 10.233.64.0/18
+```
+
+* Check the Weave parameters on each node
+
+```ShellSession
+# On nodes
+ps -aux | grep weaver
+# output on node1 (here it uses seed mode)
+root      8559  0.2  3.0 365280 62700 ?        Sl   08:25   0:00 /home/weave/weaver --name=fa:16:3e:b3:d6:b2 --port=6783 --datapath=datapath --host-root=/host --http-addr=127.0.0.1:6784 --status-addr=0.0.0.0:6782 --docker-api= --no-dns --db-prefix=/weavedb/weave-net --ipalloc-range=10.233.64.0/18 --nickname=node1 --ipalloc-init seed=fa:16:3e:b3:d6:b2,fa:16:3e:f0:50:53 --conn-limit=30 --expect-npc 192.168.208.28 192.168.208.19
+```
+
+## Consensus mode (default mode)
+
+This mode is best suited to clusters of static size.
+
+## Seed mode
+
+This mode is best suited to clusters of dynamic size.
+
+Seed mode also allows multi-cloud and hybrid on-premises/cloud cluster deployments.
+
+* Switch from consensus mode to seed/observation mode
+
+See [weave ipam documentation](https://www.weave.works/docs/net/latest/tasks/ipam/ipam/) and use `weave_extra_args` to enable.
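+
+A hedged sketch of such a switch is shown below. It assumes `weave_extra_args` is appended verbatim to the `weaver` command line, and it reuses the peer names from the `ps` output above purely as placeholders.
+
+```yml
+# Assumption: the string is passed unchanged to weaver; replace the peer names with your own
+weave_extra_args: "--ipalloc-init seed=fa:16:3e:b3:d6:b2,fa:16:3e:f0:50:53"
+```
+
+After redeploying, the `ps -aux | grep weaver` check above should show the `--ipalloc-init` argument if the setting took effect.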