* etcd: etcd-events doesn't depend on etcd_cluster_setup
Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
* etcd: remove condition already present on include_tasks
Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
* etcd: fix scaling up
Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
* etcd: use *access_addresses, do not delegate to etcd[0]
We want to wait for the full cluster to be healthy,
so use all the cluster addresses
Also we should be able to run the playbook when etcd[0] is down
(not tested), so do not delegate to etcd[0]
Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
* etcd: use failed_when for health check
unhealthy cluster is expected on first run, so use failed_when
instead of ignore_errors to remove scary red messages
Also use run_once
Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
* kubernetes/preinstall: ensure ansible_fqdn is up to date after changing /etc/hosts
Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
* kubernetes/master: regenerate apiserver cert if needed
Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
(cherry picked from commit a35b6dc1af)
* containerd: add proxy support
Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
* kubespray-defaults: add kube_service_addresses / kube_pods_subnet to no_proxy
CIDR notation in no_proxy is supported by a lot of programs/languages,
including go: https://github.com/golang/go/issues/16704
Without that containerd cannot talk to the API server (kube_apiserver_ip),
but it should not go through an external proxy for the nodes/pods/services
Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
(cherry picked from commit 9f2dd09628)
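To illustrate what a CIDR entry in no_proxy means, here is a minimal Python sketch that checks whether a destination IP is covered by a no_proxy list. It is not how Go or containerd implement the lookup; the function name `bypasses_proxy` and the sample subnets are assumptions made for illustration only.

```python
import ipaddress


def bypasses_proxy(dest_ip, no_proxy):
    """Return True if dest_ip matches a CIDR or literal IP entry in no_proxy."""
    ip = ipaddress.ip_address(dest_ip)
    for entry in no_proxy.split(","):
        entry = entry.strip()
        if not entry:
            continue
        try:
            # ip_network accepts both "10.233.0.0/18" and a bare "127.0.0.1"
            network = ipaddress.ip_network(entry, strict=False)
        except ValueError:
            # Hostname or domain entries are ignored in this sketch
            continue
        if ip in network:
            return True
    return False


# Hypothetical values standing in for kube_service_addresses / kube_pods_subnet
NO_PROXY = "localhost,127.0.0.1,10.233.0.0/18,10.233.64.0/18"
```

With these sample values, an address inside either subnet bypasses the proxy, while any other address would still go through it.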
* Backport Kubernetes 1.16.8 (#5770)
* Kubernetes 1.16.8
* Upgrade etcd to 3.3.12 (#5718)
* Use kubespray 2.11.2 as start version for the upgrade test case
* Remove dockerproject org (#5548)
* Change dockerproject.org to download.docker.com
dockerproject.org was deprecated in 2017 and has gone down.
* Restore yum repo for containerd
Change-Id: I883bb512a2164a85865b1bd4fb569af0358c8c2b
Co-authored-by: Craig Rodrigues <rodrigc@crodrigues.org>
* remove legacy docker repo in kubernetes/preinstall before any packages installed (#5640)
* Remove dockerproject_.+_repo_.+ variables (#5662)
Change 38688a4486 replaces the
value for dockerproject_.+_repo_.+ docker variables but their new
value was previously defined in other variables. This change removes
the dockerproject_.+_repo_.+ docker variables in favor of the older
ones.
* Remove stale legacy yum docker repo /etc/yum.repos.d/docker.repo (#5569)
* Remove stale legacy yum docker repo /etc/yum.repos.d/docker.repo
* move task 'Remove legacy docker repo file' to pre-upgrade.yml
* fix upgrade procedure (#5695) when the playbook
includes role kubernetes/preinstall but not role container-engine:
error 'yum_repo_dir' is undefined
Co-authored-by: Matthew Mosesohn <matthew.mosesohn@gmail.com>
Co-authored-by: Craig Rodrigues <rodrigc@crodrigues.org>
Co-authored-by: Victor Morales <chipahuac@hotmail.com>
When running with serial != 100%, like upgrade_cluster.yml, we need to apply this fixup each time
Problem was introduced in 05dc2b3a09
Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
(cherry picked from commit 5e9479cded)
* Fix python3-libselinux installation for RHEL/CentOS 8
In bootstrap-centos.yml we haven't gathered the facts,
so #5127 couldn't work
Minimum ansible version to run kubespray is 2.7.8,
so ansible_distribution_major_version is defined and there is no need to default it
Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
* Restart NetworkManager for RHEL/CentOS 8
network.service doesn't exist anymore
# systemctl status network
Unit network.service could not be found.
Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
* Add module_hotfixes=True to docker / containerd yum repo config
https://bugzilla.redhat.com/show_bug.cgi?id=1734081
https://bugzilla.redhat.com/show_bug.cgi?id=1756473
Without this setting you end up with the following error:
# yum install docker-ce
Failed to set locale, defaulting to C
Last metadata expiration check: 0:03:21 ago on Thu Sep 26 22:00:05 2019.
Error:
Problem: package docker-ce-3:19.03.2-3.el7.x86_64 requires containerd.io >= 1.2.2-3, but none of the providers can be installed
- cannot install the best candidate for the job
- package containerd.io-1.2.2-3.3.el7.x86_64 is excluded
- package containerd.io-1.2.2-3.el7.x86_64 is excluded
- package containerd.io-1.2.4-3.1.el7.x86_64 is excluded
- package containerd.io-1.2.5-3.1.el7.x86_64 is excluded
- package containerd.io-1.2.6-3.3.el7.x86_64 is excluded
(try to add '--skip-broken' to skip uninstallable packages or '--nobest' to use not only best candidate packages)
Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
* add support for nova servergroups
* Add documentation for openstack nova servergroups
* update to TF 0.12.12 format and fix etcd
* revert for_each change
* fix variables and formatting in main.tf
* try to avoid errors
* update variable
* Update main.tf
* Update main.tf
* update all other instance resources
Initially this was to fix a mis-indented approvers key. However, it turns
out that 'oilbeater' is not a member of kubernetes-sigs nor
kubernetes-incubator (the org this repo was migrated from). Thus this
OWNERS file is failing prow's validation check.
As a workaround I've opted to move them to emeritus_approver, which
isn't validated and can be used as a hint for other approvers in this
repo
This fixes the scenario where masters are upgraded one at a time
and coredns gets improperly scaled back up to 2 replicas.
Change-Id: I7cc9283f40efcfd61b5813c89a5805c95d901567
* Update parsing of terraform state file for 0.12.12
* Resource does not seem to have a module element but instead has
provider
* Return the boolean right away if it is already a bool, since a bool does
not have a lower method
* Remove the setting of ansible_ssh_user to root for all Packet
Not all servers in packet are accessed as root by default. CoreOS
systems use the `core` user. Removing this allows the user to specify
the remote user with an extra_var or in an ansible.cfg file.
* Default to root user for packet devices except on CoreOS
* Update TF_VERSION for packet in tf-validate-packet
Update TF_VERSION to 0.12.12 for gitlab-ci tf-validate-packet tests
* convert packet terraform files to TV_VERSION 4
* initialize terraform before copying the variable file to the top level dir
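The boolean handling mentioned in the state-file parsing item above can be sketched as follows. This is a minimal illustration, not the inventory script's actual code: the name `parse_bool` and the set of accepted strings are assumptions.

```python
def parse_bool(value):
    """Interpret a state-file value as a boolean (hypothetical helper)."""
    if isinstance(value, bool):
        # Already a bool: return immediately, since bool has no .lower() method
        return value
    # Assumed truthy spellings; the real script may accept a different set
    return str(value).lower() in ("true", "yes", "1")
```

Checking the type first avoids an AttributeError when the state file already stores a native boolean instead of a string.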
Kubespray Pull Request #5084 (https://github.com/kubernetes-sigs/kubespray/pull/5084) caused more problems than it solved due to limitations with the synchronize module. See comments on Kubespray Issues #5059 (https://github.com/kubernetes-sigs/kubespray/issues/5059) and #5116 (https://github.com/kubernetes-sigs/kubespray/issues/5116). Details from Ansible documentation: "Currently, synchronize is limited to elevating permissions via passwordless sudo. This is because rsync itself is connecting to the remote machine and rsync doesn’t give us a way to pass sudo credentials in. ... Currently there are only a few connection types which support synchronize (ssh, paramiko, local, and docker) because a sync strategy has been determined for those connection types. Note that the connection for these must not need a password as rsync itself is making the connection and rsync does not provide us a way to pass a password to the connection. ..." Thus, reverting Pull Request #5084.
* Add support for Kubernetes 1.16.1
* Defaults to 1.16.1
* add 1.16.2 checksums and set new version as default
* correct 1.16.2 checksums and add 1.15.5 checksums
Since it is unsupported to skip upgrades, I've detailed the steps for upgrading a step at a time and removed some language that indicated it should work
When using cluster.yml or scale.yml to add/scale nodes in the existing
k8s cluster, the `kubeadm init` wouldn't run. As a result, kube-proxy
wouldn't be created, and therefore the kube-proxy deletion task would
fail, e.g. in the case where kube-router is used and "kube_proxy_remove"
is set to true. As a workaround, add ignore_errors to the kube-proxy
deletion task.
It's unnecessary and breaks when running from within a docker container:
```
An exception occurred during task execution. To see the full traceback, use -vvv. The error was: TimeoutError: Timer expired after 10 seconds
fatal: [localhost]: FAILED! => {"changed": false, "cmd": "/usr/sbin/udevadm info --query property --name /dev/mapper/vg00-root", "msg": "Timer expired after 10 seconds", "rc": 257}
```
The script is not usable unless you are in the '.vagrant/provisioners/ansible/inventory/artifacts' folder.
This update makes this usable from anywhere.
- do not run etcd role when etcd_kubeadm_enabled == true
- remove default value 'systemd' for cgroup driver in containerd role.
this value overrides the autodetection in kubelet_cgroup_driver_detected from docker info
This makes it easy to override the gcr, quay, and docker repos with
mirror repos in countries like China, where access to the defaults is
blocked or unstable.
mydict.keys() should be converted to list,
otherwise it causes errors in loop iteration.
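A small sketch of the underlying Python 3 behaviour: `dict.keys()` returns a view object rather than a list, so code that expects a list (indexing, for example) rejects it until it is converted. The dictionary contents here are made up for illustration.

```python
# Hypothetical data; only the keys() behaviour matters here
mydict = {"node1": "10.0.0.1", "node2": "10.0.0.2"}

keys_view = mydict.keys()
try:
    keys_view[0]  # a dict view does not support indexing
    indexable = True
except TypeError:
    indexable = False

# Converting to a list gives a plain sequence that list-only code accepts
keys_list = list(mydict.keys())
first_key = keys_list[0]
```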
Remove extra space after class name, which broke configmap.
Also allow setting the reclaimPolicy property.
Cleaned up deprecated APIs:
apps/v1beta1
apps/v1beta2
extensions/v1beta1 for ds,deploy,rs
Add workaround for deploying helm using incompatible
deployment manifest.
Change-Id: I78b36741348f47a999df3841ee63cf4e6f377830
* Use python3-libselinux on RHEL8/Centos8
* The fact ansible_facts.distribution_major_version is not present on older Ansible versions.
Default it to 0 when not present and use libselinux-python as the package to keep the current
default behaviour.
Fix for Kubespray Issue #5059 (https://github.com/kubernetes-sigs/kubespray/issues/5059). There is a known issue with the 'fetch' module that will sometimes lead to it failing with a memory error. See ansible/ansible#11702 (https://github.com/ansible/ansible/issues/11702). I encountered this issue with the "Copy kubectl binary to ansible host" task in kubespray/roles/kubernetes/client/tasks/main.yml, and it caused my entire deployment to error out (see "Output of ansible run" above). Replacing 'fetch' with 'synchronize' fixes this issue.
Updated Openstack to terraform 0.12 (#5062)
* update openstack to terraform 0.12(.5)
* replace cluster.tf with cluster.tfvars
* update README.md to terraform 0.12
* update Openstack CI tests to use terraform 0.12
* specify terraform version in openstack README
* gitlab CI to copy cluster.tfvars in case of openstack provider
* The terraform/openstack dynamic inventory can read
tfstate v4 (generated by terraform 0.12) and convert them internally
to v3 (as generated by terraform 0.11.x).
Additionally the script has been updated to Python 3.
* run 'task download_container | Copy image to ansible host cache' with synchronize on download_delegate host
* try to run task copy file to ansible host on all inventory, not only on first random host
Fixes situation when using manual mode because it
tries to download coredns v1.3.1 from the same
image repository where kubernetes images are
downloaded from.
Change-Id: Ibbec8a72c8162ce8befa74e2013a268737ea5f8a
* Refactor calico-rr to run in k8s cluster with taint
Change-Id: I75a3169ff5b36ce8302fc7ef1c32d3eb697b5afa
* add preinstall checks
* rework calico/rr role
Change-Id: I2f0a7e6cb77cf91ad4a615923680760d2e5d9ca8
* add empty calico-rr group
Change-Id: I006c0a60db9b72d02245bf8fdfabcf982144a5ad
* lvm packages removal during tear down skipped by default
* lvm utils execution PATH fixed for CentOS/RH
* Heketi updated to the latest version 9
Signed-off-by: Vitaliy Dmitriev <vi7alya@gmail.com>
* Let Premoderator script add labels
* Fix JQ error
* Minor fixes
* Debug patch label output
* Try again
* Try again
* Try again
* Try again
* Try again
* Minor cleanup
* Enable nodes to run calicoctl
per-node tasks require waiting for calico-node to be applied
Change-Id: Ibe1076b7334a2da0332f2dd766fde0c3f172d1f2
* cleanup tasks that should run on master
Change-Id: I43a837879ef41596f14657ecd7f813899b6865ae
* Switch run_once calico logic to just run on first master
Change-Id: I6893711e354f63c5e1eaf6ac2e23d9a6347a555d
Update README.md to link to the open issue that shows Ansible 2.8.x doesn't work with Kubespray. The requirements.txt file is already fixed to 2.7.8 so only the README needed updating, I think.
* Enable containerd to deploy vanilla containerd package
Fixes kubeadm references to CRI socket for containerd
Fixes download role cache feature to work with containerd
Change-Id: I2ab8f0031107e2f0d1a85c39b4beb66f08509a01
* use containerd for flannel-addons job
Change-Id: Ied375c7d65e64a625ffbd995ff16f2374067dee6
* add containerd vars
Change-Id: Ib9a8a04e501c481a86235413cbec63f3672baf91
* fixup vars
Change-Id: Ibea64e4b18405a578b52a13da100384582aa24c2
* more fixes
* fix rh repo
Change-Id: I00575a77cfb7b81d6095db5d918a52023c8f13ba
* Adjust helm host install for containerd
* Add calico 3.7.3 support
* add calico_datastore variable to policy controller role
* add missing clusterrole rules for calico policy controller
* disable calico kube controller when kdd mode is used for versions < 3.6
8080 is a pretty common port; using nodelocaldns_ip:8080 still
prevents node processes or hostNetwork=true processes from binding to *:8080,
so switch to 9254 by default (prometheus port is 9253)
Signed-off-by: Etienne Champetier <champetier.etienne@gmail.com>
* Use K8s 1.15
* Use Kubernetes 1.15 and use kubeadm.k8s.io/v1beta2 for
InitConfiguration.
* bump to v1.15.0
* Remove k8s 1.13 checksums.
* Update README kubernetes version 1.15.0.
* Update metrics server 0.3.3 for k8s 1.15
* Remove less than k8s 1.14 related code
* Use kubeadm with --upload-certs instead of --experimental-upload-certs, which is deprecated
* Update dnsautoscaler 1.6.0
* Skip certificateKey if it's not defined
* Add kubeadm-controlplane.v2beta2 for k8s 1.15 or later
* Support kubeadm control plane for k8s 1.15
* Update sonobuoy version 0.15.0 for k8s 1.15
* Add limited containerd support
Containerd support for Ubuntu + Calico
* Added CRI-O support for ubuntu
* containerd support.
* Reset containerd support.
* fix lint.
* implemented feedback
* Change task name cri xx instead of cri-o in reset task and timeout condition.
* set crictl to fixed version
* Use docker-ce's containerd.io package for containerd.
* Add check containerd is installable or not.
* Avoid stopping docker when using containerd and optimize retry for reset.
* Add config.toml.
* Fixed containerd for kubelet.env.
* Merge PR #4629
* Remove unused ubuntu variable for containerd
* Polish code for containerd and cri-o
* Refactoring cri socket configuration.
* Configurable conmon.
* Remove unused crictl/runc download
* Now crictl and runc are downloaded by common crictl.yml.
* fixed yamllint error
* Fixed files broken by conflict.
* Remove commented line in config.toml
* Remove readded v1.12.x version
* Fixed broken set_docker_image_facts
* Fix yamllint errors.
* Remove unused apt source
* Fix crictl could not be installed
* Add containerd config from skolekonov's PR #4601
* add macvlan cni to kubespray
* macvlan: lint yaml files and fix sample config file
* macvlan: add OWNERS file
* add macvlan to README
* macvlan : CI first shot
* macvlan : CI add full masquerade
* delegate retrieving pod cidr to master only
* macvlan: add config for CI
* macvlan: add netchecker deployment
kubernetes/master role defines this value as an empty string
when using a cloud provider, not undefined. The check was updated
accordingly.
Change-Id: I58dc31ef4fd568a717a6753eb89ca687933018ae
* Require minimum version of Kubernetes
* Remove checksums for kubernetes version 1.12
* Add kube_version to precheck output and add min required version to README
* Fix merge
* Fix defaults
* Fix typo in precheck
* File and container image downloads are now cached locally, so that repeated vagrant up/down runs do not trigger downloading of those files. This is especially useful on laptops with kubernetes running locally on VMs. The total size of the cache, after an ansible run, is currently around 800MB, so bandwidth (=time) savings can be quite significant.
* When download_run_once is false, the default is still not to cache, but setting download_force_cache will still enable caching.
* The local cache location can be set with download_cache_dir and defaults to /tmp/kubernetes_cache
* A local docker instance is no longer required to cache docker images; Images are cached to file. A local docker instance is still required, though, if you wish to download images on localhost.
* Fixed a FIXME, where the argument was that delegate_to doesn't play nice with omit. That is a correct observation and the fix is to use default(inventory_host) instead of default(omit). See ansible/ansible#26009
* Removed "Register docker images info" task from download_container and set_docker_image_facts because it was faulty and unused.
* Removed redundant when:download.{container,enabled,run_once} conditions from {sync,download}_container.yml
* All features of commit d6fd0d2aca by Timoses <timosesu@gmail.com>, merged May 1st 2019, are included in this patch. Not all code was included verbatim, but each feature of that commit was checked to be working in this patch. One notable change: The actual downloading of the kubeadm images was moved to {download,sync}_container, to enable caching.
Note 1: I considered splitting this patch, but most changes that are not directly related to caching, are a pleasant by-product of implementing the caching code, so splitting would be impractical.
Note 2: I have my doubts about the usefulness of the upload, download and upgrade tags in the download role. Must they remain or can they be removed? If anybody knows, then please speak up.
* Make local volume provisioner dir mode a variable
I need to change this for Nagios monitoring. Others may
need to as well. Had to close previous commits, sorry for
the spam.
Task "kube-roter | Set cni directory permissions"
sets ownership of /opt/cni/bin to "kube"
Task "kube-router | Copy cni plugins"
copies the binaries from the archive setting the ownership
back to "root"
Fix "kube-roter" typo
Signed-off-by: Alberto Murillo <albertomurillosilva@gmail.com>
* Add support for arm images for hyperkube, kubeadm and cni_binary
* Add dummy etcd checksum for arm
This commit adds dummy etcd checksum for arm to avoid "no attribute" error
during setup.
* Add etcd host assert check
* Add 1.13.4 checksums of kubeadm and hyperkube for arm
* Update checksums of kubeadm and hyperkube for arm
* Add dummy checksums for calicoctl_binary_checksums dict
* disable gather_facts because it causes tests to fail
* Remove architecture check for etcd, because tests could not run with it
Long option --become was used in the example but in the comment describing it the short option -b was used.
Use same option in description and example to avoid confusion.
* Added pod psp in Rancher Local Path Provisioner
Added pod security policy (psp) in Rancher Local Path Provisioner.
Signed-off-by: André R. de Miranda <andre@miranda.work>
* Apply psp for Rancher Local Path Provisioner only when local_path_provisioner_namespace is not kube-system and also reorganized the templates
Kubespray waits for each drain to finish before starting the next one.
Draining nodes one after another seems better than draining in parallel, because resource availability should be checked each time.
However, this creates an additional problem: pods may be restarted on nodes that are themselves drained a little later.
A fast cordon before the heavy drain seems like an easy solution.
nginx fails to start because requiredDropCapabilities drops all capabilities.
The nginx requires the following capabilities:
- CHOWN
- SETGID
- SETUID
Signed-off-by: André R. de Miranda <andre@miranda.work>
* updated ansible pinning to prevent more possibilities of breaking changes
* more exact pinning of ansible version
* more exact pinning of ansible version and also all the rest
* added testing requirements.txt pinning settings
* removed boto from testing requirements.txt
Without this, pulls are considered for all
hosts groups, even if not targeted by the downloads
`groups` list. Hence, a download/sync is triggered
even though the host does not require the image.
* Disable kube_api_anonymous_auth by default to secure the setup
* Disable metrics-server in addons. Health endpoint is slow and unstable
* Fix anonymous-auth missing in configuration
* Cleanup a bit
* Fix kube anon auth
* Download to delegate and sync files when download_run_once
* Fail on error after saving container image
* Do not set changed status when downloaded container was up to date
* Only sync containers when they are actually required
Previously, non-required images (pull_required=false as
image existed on target host) were synced to the target
hosts. This failed as the image was not downloaded to
the download_delegate and hence was not available for
syncing.
* Sync containers when only missing on some hosts
* Consider images with multiple repo tags
* Enable kubeadm images pull/syncing with download_delegate
* Use kubeadm images list to pull/sync
'kubeadm config images pull' is replaced by collecting the images
list with 'kubeadm config images list' and using the commonly
used method of pull/syncing the images.
* Ensure containers are downloaded and synced for all hosts
* Fix download/syncing when download_delegate is a kubernetes host
When docker_container_storage_setup is false,
docker service should not depend on docker-storage-setup service,
because it's not installed.
For example, when using overlay2 on recent RHEL 7/Centos 7 kernels,
you most likely don't need it.
* Fix nodeselectors for contiv and nginx-ingress
Change-Id: Ib3eb6bd87193c69a90ee944c9164a0b6792c79ba
* Set kube proxy mode to iptables for addons task
Change-Id: Iff71a71f672405c74b4708c71db15ddc4391a53a
Cilium requires a recent kernel. The rhel7 and centos7 kernels are too old, so they are removed.
Bumping ubuntu to ubuntu-1804
Change-Id: Ib1bffa45b8f9ed0ba500f751714372b3a3f7878b
* Mark "Calico | Set global as_num" as "unchanged"
This command executes with "--skip-exists" parameter, so it is idempotent
and should not be marked as "changed".
* trigger ci
add support for the following properties in azure-credential-check.yml
- azure_loadbalancer_sku: Sku of Load Balancer and Public IP. Candidate values are: basic and standard.
- azure_exclude_master_from_standard_lb: excludes master nodes from standard load balancer.
- azure_disable_outbound_snat: disables the outbound SNAT for public load balancer rules
- useInstanceMetadata: Use instance metadata service where possible
- azure_primary_availability_set: (Optional) The name of the availability set that should be used as the load balancer backend
We don't need to support upgrades from 2 year old installs,
just from the last major version.
Also changed most retried tasks to 1s delay instead of longer.
* Add README to bootstrap-os role
* Rework bootstrap-os once more
* Document workarounds for bugs/deficiencies in Ansible modules
* Unify and document role variables
* Remove installation of additional packages and repositories
* Merge Ubuntu and Debian tasks
* Remove pipelining setting from default playbooks
* Fix OpenSUSE not running its required tasks
Ansible 2.0 has deprecated the “ssh” from ansible_ssh_host.
Updating the docs to be more aligned with the Ansible version used in the sample/inventory.ini file as well.
Also adding `[bastion]` group in the docs to avoid confusion.
The docker service provided by the containers-basic bundle is masked
in ClearLinux distribution. This is causing errors in the following
steps. This commit ensures that the unit is not masked.
* Use K8s 1.14 and add kubeadm experimental control plane mode
This reverts commit d39c273d96.
* Cleanup kubeadm setup run on first master
* pin kubeadm_certificate_key in test
* Remove kubelet autolabel of kube-node, add symlink for pki dir
Change-Id: Id5e74dd667c60675dbfe4193b0bc9fb44380e1ca
The BINDIR variable defined in runc's Makefile[1] sets the
installation path to $(PREFIX)/sbin, which is used by most
Linux distributions. This change fixes the absolute path used for
non-ClearLinux distributions (CentOS, Ubuntu).
[1] https://github.com/opencontainers/runc/blob/master/Makefile#L10
The Stateless ClearLinux feature[1] requires the creation of folders
in /etc folder. This change ensures the existence of the
/etc/bash_completion.d/ folder for ClearLinux Distribution.
[1] https://clearlinux.org/features/stateless
* PHASE 2 - Enable Packet-CI in gitlab
* Add gitlab files
* Reset files back and only keep Packet
* Include packet
* Add missing Upgrade Tests
* Update GCE jobs etc
* Fix bug
* Yaml lint all gitlab files
* Remove GCE
* Test
* Test again
* Enable GCE again
* Install requirements
* Cleanup the gitlab file
* Cleanup runner tags
* Install requirements
* Test
* Test variables for gce
* Test again
* Test again
* Fix
* Update
* Add k8s_allowed_remote_ips variable
Useful for defining CIDRs allowed to initiate a SSH connection when
you don't want to use a bastion.
* Add TF_VAR_k8s_allowed_remote_ips variable to tf-apply-ovh
* Enable nodelocaldns by default
* nodelocaldns is now default
* Disable enable_nodelocaldns for the addons CI jobs
Disable enable_nodelocaldns for the addons CI jobs to make sure things still work without nodelocaldns
* Add ansible-lint as gitlab-ci step
* Fix jinja2 syntax in include_tasks that breaks ansible-lint
* Use a block scalar to get around gitlab quoting/escaping rules
* Run ansible-lint in verbose mode in CI
* Update CI to use 2.9.0 release and update Dockerfile to now use 18.04
* Update the kubectl bin
This will fix error: error converting YAML to JSON: yaml: line 36: mapping values are not allowed in this context
Signed-off-by: Abdulaziz AlMalki <almalki.a@gmail.com>
* Vagrantfile: Bump openSUSE to Leap 15.0
* roles: container-engine: Add 'containerd' package for openSUSE
The 'containerd' package contains the docker-containerd and
docker-containerd-shim binaries. We also need to ensure that the latest
version is installed since an older version may already be present (eg GCE
images)
* Remove docker log-opts for opensuse
* roles: bootstrap-os: Use lowercase 'o' for openSUSE
OpenSUSE is not a valid family name. The correct one is openSUSE
* roles: bootstrap-os: Update zypper cache before first installation
The zypper cache may be outdated so ensure that it's fully updated
before we try and install the bootstrap packages.
* Run terraform fmt
* Add terraform fmt to .terraform-validate CI step
* Add tf-validate-aws CI step
* Revert "Add tf-validate-aws CI step"
This reverts commit e007225fac.
The .terraform directory is populated when modules are downloaded:
https://www.terraform.io/docs/commands/get.html
"The modules are downloaded into a local .terraform folder. This folder should not be committed to version control."
Both the `yum` and `apt` modules support a list as input, which allows us to avoid the slower `with_items` approach, which can take a long time with a large count of cluster nodes.
Both kubedns and dnsmasq modes have long been unmaintained.
We should run dns_late steps at the end because sshd
makes DNS lookups during Ansible run and has 2s timeouts
for each failed lookup trying to connect to coredns before
it is ready.
* Lint everything in the repository with yamllint
* yamllint fixes: syntax fixes only
* yamllint fixes: move comments to play names
* yamllint fixes: indent comments in .gitlab-ci.yml file
values from inventory in roles/kubespray-defaults/defaults/main.yml
hardcoded values in roles/container-engine/defaults/main.yml
dns_servers set empty in roles/container-engine/defaults/main.yml and skydns_server not set in docker_dns_servers variables
also set default value for manual_dns_serve
other variables in roles/container-engine/defaults do not need to be set
* feat(external-provisioner/local-path-provisioner): adds support for local path provisioner
Helpful for local development but also in production workloads (once the
permission model is worked out) where you have redundancy built into the
software that uses the PVCs (e.g. database cluster with synchronous
replication)
* feat(local-path-provisioner): adds debug flag, image tag group var
* fix(local-path-provisioner): moves image repo/tag to download role
* test(gce_centos7-flannel): enables local-path-provisioner in test case
* fix(addons): add image repo/tag to commented default values
* fix(local-path-provisioner): typo in jinja template for local path provisioner
* style(local-path-provisioner): debug flag condition re-formatted
* fix(local-path-provisioner): adds missing default value for debug flag
* fix(local-path-provisioner): syntax fix for debug if condition end
* fix(local-path-provisioner): jinja template syntax: if condition white space
This was already approved in #4106 but there are CI issues
with that PR due to references to kubernetes incubator.
After upgrading to Kubespray 2.8.1 with Kubeadm enabled Rook
Ceph volume provision failed due to the flexvolume plugin dir not
being correct. Adding the var fixed the issue
* Adding ability to maintain existing Encryption Secrets at Rest.
If secrets_encryption.yaml is present it will not be overwritten with a new kube_encrypt_token.
This should allow for it to be set ahead of a playbook running or maintain it if cluster.yml is run on the same cluster and the ansible host does not have access to the secrets.
* Setting existing kube_encrypt_token across all master nodes in case it was missing in one or more nodes.
This fixes an issue where the `nodename` in calico's cni config json can fall out of sync with the k8s node name used by the calico pod if `kube_override_hostname` is set
Currently, the task `container_download | download images for kubeadm config images` fetches etcd image even though it's not required (etcd is bootstrapped by kubespray, not kubeadm).
`kubeadm-images.yaml` is only a subset of `kubeadm-config.yaml`, therefore `kubeadm config images pull` will try to fetch the whole list (including etcd):
```
# kubeadm config images list --config /etc/kubernetes/kubeadm-images.yaml
k8s.gcr.io/kube-apiserver:v1.13.2
k8s.gcr.io/kube-controller-manager:v1.13.2
k8s.gcr.io/kube-scheduler:v1.13.2
k8s.gcr.io/kube-proxy:v1.13.2
k8s.gcr.io/pause:3.1
k8s.gcr.io/etcd:3.2.24
k8s.gcr.io/coredns:1.2.6
```
When using the `kubeadm-config.yaml` though, it doesn't list etcd image:
```
# kubeadm config images list --config /etc/kubernetes/kubeadm-config.yaml
k8s.gcr.io/kube-apiserver:v1.13.2
k8s.gcr.io/kube-controller-manager:v1.13.2
k8s.gcr.io/kube-scheduler:v1.13.2
k8s.gcr.io/kube-proxy:v1.13.2
k8s.gcr.io/pause:3.1
k8s.gcr.io/coredns:1.2.6
```
This change just adds the etcd endpoints in the `kubeadm-images.yaml` to give kubeadm a hint that it doesn't need the etcd image for its bootstrapping, as etcd is "external".
I confess it is an ugly hack; a better way would be to use a single `kubeadm-config.yaml` for both tasks, but they are triggered by different roles (`kubeadm-images.yaml` is used by download, `kubeadm-config.yaml` by kubernetes/master) at different steps and I didn't want to refactor too many things to prevent breakage.
This is especially useful for offline installation where a whitelist of container images is mirrored on a local private container registry. `k8s.gcr.io/etcd` and `quay.io/coreos/etcd` are two different repositories hosting the same images but using *different tags*!
* coreos/etcd:v3.2.24
* k8s.gcr.io/etcd:3.2.24 (note the missing 'v' in the tag name)
Fix issue where `kubeadm join` could wait forever for joining.
Fix issue where `kubeadm join` errors were not reaching the user, making it
impossible to find the cause of the failure.
New behaviour is to first attempt to join without bypassing the
verification checks and to display any errors if needed.
If this fails, it still attempts to join while ignoring the checks in
order to preserve the previous behavior.
A timeout of 60 seconds is allocated for joining.
Related-bug: #3973
* bootstrap: rework role
* support being called from a non-root user
* run some commands in check mode
* unify spelling/task names
* bootstrap: fix wording of comments for check_mode: false
* bootstrap: remove setup-pipelining task
* OCI subnet AD 2 is not required for CCM >= 0.7.0
Reorganize OCI provider to generate configuration, rather than pull
Add pull secret option to OCI cloud provider
* Updated oci example to document new parameters
The inventory/mycluster directory gets created when someone follows
the instructions in README.md, but it should never be committed to
the kubespray repo. Ignore it.
* add to inventory.py script ability to indicate ip ranges
* add test for range2ip function for inventory.py script
some fixes
* add negative test for range2ip function for inventory.py script
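As a rough illustration of the range expansion these commits describe, the sketch below turns an inclusive `start-end` IPv4 range into individual addresses. The function name `expand_ip_range` and the `start-end` input format are assumptions made for this example; the actual `range2ip` function in `inventory.py` may be written differently.

```python
import ipaddress


def expand_ip_range(ip_range):
    """Expand an inclusive 'start-end' IPv4 range into a list of address strings.

    Hypothetical helper for illustration only, not the code in inventory.py.
    """
    start_text, sep, end_text = ip_range.partition("-")
    if not sep:
        raise ValueError("expected 'start-end', got %r" % ip_range)
    start = ipaddress.IPv4Address(start_text.strip())
    end = ipaddress.IPv4Address(end_text.strip())
    if end < start:
        raise ValueError("range end is below range start: %r" % ip_range)
    # IPv4Address converts to and from int, so the range can be walked numerically.
    return [str(ipaddress.IPv4Address(n)) for n in range(int(start), int(end) + 1)]
```

A negative test in the spirit of the one mentioned above would check that a reversed range such as `10.0.0.9-10.0.0.1` raises `ValueError`.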
Run only commands that apply to the current deployed cluster (only get
calico info and skip weave/flannel when deploying calico, for example).
Add helm release info if helm is deployed
This PR ensures that the e2fsprogs and xfsprogs packages are
installed on all Kubernetes nodes and that the packages are
the latest versions. It also ensures that the nodes can
create XFS filesystems when necessary, since not all distros
install xfsprogs by default.
e2fsprogs - ext2/ext3/ext4 file system utilities
xfsprogs - Utilities for managing the XFS filesystem
This fixes the issue where if there was a hosts.ini file present in the
inventory directory, then Vagrant would set an incorrect path as
ansible.inventory_path
* Calico: Ability to define the default IPPool CIDR (instead of kube_pods_subnet)
* Documentation for calico_pool_cidr (and calico_advertise_cluster_ips which has been forgotten...)
* Add support for Packet with Terraform
Co-Author: johnstudarus <john@jhlconsulting.com>
* removed advanced features to streamline
* clarifying usage
* Update README.md
provide a better test to validate things are working OK
* Update README.md
clarifying what to set
* minor wordsmithing
* Fix admin cert path
* clarifying how to configure keys
* enabling kubeconfig_localhost
pull over the configuration file via playbooks rather than the key files individually
* Create output.tf
* Add support for node specific plans
* Set cluster DNS correctly in case of nodelocal dns cache
* Pass in cluster_ip based on dns mode
* Disable nodelocaldns by default
* Fix syntax error
* Fix syntax issue
* Add nodelocaldns ip to vars of node installation
* Change location of nodelocaldns_ip
* Try to remove newlines from jinja template
* Add debug for config file
* Move parameter logic outside of template
* Adapt templates after feedback
* Remove debugging
Addressing the discussion started in #4064, this PR moves kubeadm and
hyperkube binaries to /usr/local/bin before running them on the master
nodes.
It is to address the case where local_release_dir points to /tmp
(kubespray default) and /tmp is mounted with noexec mode, preventing
any binaries to be run in that partition.
In role "node", we still move kubeadm to bin_dir only on the worker
nodes.
I know this is a bit of a hack.
If you use a cloud LB, you can use kubeadm's controlPlaneEndpoint to configure kube-proxy's server field.
But with nginx-proxy, it has not started yet when `kubeadm init` runs.
* Fix random failure in debug: var=result.content|from_json
* netchecker agents are deployed on all k8s-cluster group members
* reducing limits/requests is not enough, switching to n1-standard-2
* gce_centos7 need more cpu
Looks like `epel_enabled` was not configured for the epel install in `bootstrap-centos.yml`. Also, there were no conditionals that would trigger bootstrap for RHEL.
* Use external LB IP for external api endpoint
Use loadbalancer_apiserver.address instead of apiserver_loadbalancer_domain_name for the kubeadm init --apiserver-advertise-address argument
https://kubernetes.io/docs/reference/setup-tools/kubeadm/kubeadm-init/#options states apiserver-advertise-address needs to be an IPv4 or IPv6 address
* only use loadbalancer IP if it is defined
I found a potential use case where `writable` could be null and therefore
not treated like a boolean, so this adds an extra default statement to
avoid negating a non-boolean as boolean, which would lead to undefined. refs #4020
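The idea of defaulting before negating can be sketched in Python as follows. The helper name `to_read_only` and the choice of `False` as the default for a missing `writable` value are assumptions for illustration; the real fix is a default in the Jinja template.

```python
def to_read_only(writable=None):
    """Derive a readOnly flag from an optional writable flag.

    Hypothetical helper: a missing (None) value is first replaced by an
    assumed default of False, so the negation always sees a real boolean.
    """
    if writable is None:
        writable = False  # assumed default, not confirmed by the commit message
    return not bool(writable)
```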
Looks like the template is removing the trailing space between storage
class entries, and since CI only has one storage class we never hit this
issue. This change will prevent the yaml from printing on a single line
when multiple storage classes are defined.
In v1beta1 of `ClusterConfiguration` the extraVolumes `writable` field was changed to `readOnly` and its boolean value must be negated.
Also, the json field for `useHyperKubeImage` was incorrectly capitalized.
Right now we're consistently getting warnings about kubelet not found in
path during `kubeadm init`. We fixed this for `kubeadm join` in #3342, and this brings the change to init
as well.
- Fixed an issue where storage class host directories were looped
through excessive target hosts
- Fixes examples in the LVP `README.md` to use nested dicts instead of a
list of dicts
* Makes local volume provisioner more dynamic
* Correct variable name in local storage provisioner defaults
* Updates external-provisioner readme
* Updates variable naming to be more clear, more documentation, fixes sample inventory
* Variable refactor, untangled some jinja2 loops
* Corrects variable name
* No variable substitution in dict keys, replaced with anchor
* Fixes default storage_classes dict, inline docs
* Fixes spelling in inline docs
* Addresses comments in review
* Updates all the defaults
* Fix failing CI task
* Fixes external provisioner daemonset
--limit doesn't work when using remove-node.yml as there is group listing with "hosts: kube-master" in the playbook. Thus, remove-node/pre-remove/post-remove tasks are skipped as they are filtered by group "hosts: kube-master"
* allows to override the bind addresses for controller-manager and scheduler
Useful for Prometheus metrics monitoring
* Add bind addr override support in kubeadm/v1beta1
Adds support for override of bind addresses for controller-manager
and scheduler in kubeadm/v1beta1
* Move location of bind address vars
* Remove double declaration of schedulerExtraArgs
The change implemented in #3908 removed line breaks for supplementary
addresses in kubeadm SANs, causing errors in the config file and
failure to bring the cluster up. This commit reimplements line breaks
between supplementary addresses.
- Creates and defaults an ansible variable for every configuration option in the `kubeproxy.config.k8s.io/v1alpha1` type spec
- Fixes vars that were orphaned by removing non-kubeadm
- Fixes previously hardcoded kubeadm values
- Introduces a `main` directory for role default files per component (requires ansible 2.6.0+)
- Split out just `kube-proxy.yml` in this first effort
- Removes the kube-proxy server field patch task
We should continue to pull out other components from `main.yml` into their own defaults files as I did here for `defaults/main/kube-proxy.yml`. I hope for and will need others to join me in this refactoring across the project until each component config template has a matching role defaults file, with shared defaults in `kubespray-defaults` or `downloads`
The containerd service and socket files have been dropped from the
openSUSE docker package so we should not require them in the docker
service anymore. This makes the docker service file look similar to
the one shipped by the openSUSE package.
Signed-off-by: Markos Chandras <mchandras@suse.de>
Added a line documenting where to find acceptable values for the
`docker_version` setting. If you use a value that is not used as
a key value by `docker_versioned_pkg` the container-engine/docker
playbook will throw an "Unexpected templating type error". (e.g.
If you use '18.06.1' or '18.06.1-ce', neither of which is used
as a key value of `docker_versioned_pkg`, rather than '18.06',
you'll get an error when installing on Ubuntu 18.04.)
* controlPlaneEndpoint set up through load balancer should be possible even in single master setups
Enable load balancer for single-master setups
Fixes an issue where single-master setups are not reachable using the usual admin.conf from outside the cluster.
* add fix to other api versions
* remove obsolete check completely
* remove check, pass 2
* removes checks in client configuration
* delete 'and'
* Add support for running a nodelocal dns cache
After encountering dns issues in a cluster I was recently working on I
noticed Kubernetes 1.13 introduced support for running a nodelocal dns
cache.
I believe this can be useful for more people.
https://github.com/kubernetes/enhancements/blob/master/keps/sig-network/0030-nodelocal-dns-cache.md
* Add requested changes
* Add additional requested changes + documentation
* Add requested changes after review
* Replace incorrect variable
Set host_architecture so that an etcd upgrade works through: ansible-playbook -b -i inventory/sample/hosts.ini cluster.yml --tags=etcd (otherwise host_architecture is missing)
* Upgrade kubernetes to v1.13.0
* Remove all presence of scheduler.alpha.kubernetes.io/critical-pod in templates
* Fix cert dir
* Use kubespray v2.8 as baseline for gitlab
* Remove non-kubeadm deployment
* More cleanup
* More cleanup
* More cleanup
* More cleanup
* Fix gitlab
* Try stop gce first before absent to make the delete process work
* More cleanup
* Fix bug with checking if kubeadm has already run
* More fixes
* Fix test
* fix
* Fix gitlab checkout until kubespray 2.8 is on quay
* Fixed
* Add upgrade path from non-kubeadm to kubeadm. Revert ssl path
* Readd secret checking
* Do gitlab checks from v2.7.0 test upgrade path to 2.8.0
* fix typo
* Fix CI jobs to kubeadm again. Fix broken hyperkube path
* Fix gitlab
* Fix rotate tokens
* More fixes
* More fixes
* Fix tokens
* Remove variables defined in download role. Fixes #3799
* Cleanup some more variables
* Fix bad templating
* Minor fix
* Add dashboard to download role. Fixes #3736
* Set configure-cloud-routes=false as default if no network plugin is used
Because the default value of configure-cloud-routes is `true`, it needs to be set to `false` when not required to avoid error messages like:
"Couldn't reconcile node routes: error listing routes: unable to find route table for AWS cluster"
on, for example, AWS installations that don't use cloud native routing.
* Update kube-controller-manager.manifest.j2
remove extra spaces
Introduced variable node_taints which can be set in inventory for
specific hosts or in group_vars, which generates --register-with-taints
command line argument for kubelet.
Introduced variable `ingress_nginx_tolerations` to set custom
tolerations for Ingress nginx daemonset, to be able to schedule
ingress-nginx on dedicated nodes with taints.
* Update defaults to match k8s 1.12 suggestions
* Test if Netchecker works with node ip instead of localhost
* Update defaults to ipvs and coredns
* Update defaults for kube_apiserver_insecure_port
* Update main.yaml
When `ansible_user` is not root and the `-b` option is used,
with `download_run_once` and `download_localhost` set to `true`,
Ansible executes the `container_download | upload container images to nodes` task.
It uses rsync to upload images to `/tmp/release/container/`, but the
`container` directory is owned by `root`.
Now the `kubespray-aws-inventory.py` script always sets a node_labels key
to ansible_host.
When the AWS instance has no labels property set, the value is an empty
string.
The task `Write kubelet config file (kubeadm or non-kubeadm)` will then
fail with the message:
`AnsibleUndefinedVariable: 'unicode object' has no attribute 'items'`.
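The failure comes from calling `.items()` on a string instead of a mapping. A minimal sketch of guarding against that is shown below; the helper names `normalize_node_labels` and `render_label_args` are hypothetical and only illustrate the idea, they are not taken from the inventory script or the kubelet template.

```python
def normalize_node_labels(raw_labels):
    """Return a dict of labels, treating an empty string or None as 'no labels'."""
    if isinstance(raw_labels, dict):
        return dict(raw_labels)
    # Anything else (such as the empty string from an unlabeled instance)
    # has no .items() method, so fall back to an empty mapping.
    return {}


def render_label_args(raw_labels):
    """Render labels as a comma-separated key=value string, sorted by key."""
    labels = normalize_node_labels(raw_labels)
    return ",".join("%s=%s" % (key, value) for key, value in sorted(labels.items()))
```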
* Support Metrics Server as addon (#3560).
* Update metrics server v0.3.1.
* Add metrics server test.
* Replace metrics server manifests with kubernetes/cluster/addons's.
* Modify metrics server manifests for kubespray.
* Follow PR#3558 node label node-role.kubernetes.io/master change
* Fix metrics server parameters base_metrics_server_... to metrics_server_...
* Fix hard-coded metrics_server_memory_per_node
* Add configurable insecure tls for metrics-apiservice
* Downloadable addon-resizer and extract parameter as variables
* Remove metrics server version from deployment name
* Make Metrics Server work when all masters have the node role
* Download metrics-server and addon-resizer containers only on master
* Separate ServiceAccount and ConfigMap, and fix application name
* Remove old metrics server clusterrole template
* Fix addon-resizer image specification
* Make InternalIP default for metrics_server_kubelet_preferred_address_types
Make InternalIP default because multiple preferred address types do not work.
A comparison that happens during `TASK [kubernetes-apps/ansible : Kubernetes Apps | Lay Down CoreDNS Template]`, where the `dns-autoscaler` template is deployed, causes the coredns deployment to fail. The error is caused by the variable `dns_prevent_single_point_failure`, where an integer is being compared with a string. The resulting error:
```bash
'>' not supported between instances of 'int' and 'str'
```
prevents successful deployment of CoreDNS.
The change makes the comparison happen between integers and allows CoreDNS to succeed.
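The same class of error can be reproduced in plain Python 3, where `3 > "1"` raises a `TypeError` with the message quoted above. The sketch below casts both sides to integers before comparing; the function name `exceeds_minimum` is hypothetical and is not the template expression used in the role.

```python
def exceeds_minimum(node_count, minimum):
    """Compare two values as integers, whatever type they arrived as.

    Template variables often arrive as strings, so both sides are cast
    before the comparison to avoid an int-versus-str TypeError.
    """
    return int(node_count) > int(minimum)
```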
* Enable AutoScaler for CoreDNS
* Only use one template for dns autoscaler
* Rename a few variables for replicas and minimum pods
* Remove replicas to make autoscale work
* Cleanup kubedns-autoscaler as it has been renamed
add prometheus annotations to calico-node if
calico_felix_prometheusmetricsenabled is enabled.
This will allow a kubernetes_sd to automatically find the pods and start
scraping.
* Fix "Failure talking to yum: Cannot find a valid baseurl for repo: base/7/x86_64" when installing packages on CentOS through a proxy
* Add proxy to /etc/yum.conf if http_proxy is defined
* Added changes to clean up orphan containers and reload docker & kubelet directories.
* Added new files for cleaning up orphans and docker & kubelet directories
* Added new lines at the end of these files
* removed the trailing whitespaces from main.yml and clean-up.yml
* Updated as per the review comments
* Updated as per the review comments
* Removed service_facts and package_facts because they are not supported in ansible 2.4.0
* Corrected yaml syntax errors
* Removed the use of json_query filter and utilized selectattr
* Removed trailing spaces
* Changed the default value of docker_clean_up to false
* Added Changes to only include cleanup-docker-orphans.sh
* Reverted back changes done inside handler.
* Removed trailing spaces and made default value of docker_orphan_clean_up as true
* Reverted the default value of docker_orphan_clean_up as false
* Made the docker clean up as drop in
* Made the docker clean up as drop in
* Reverted the value of boolean docker_orphan_clean_up to false
* Converted ExecStop to ExecStartPost. Removed the live restore check from the orphan script
* Adds support for Multus (multiple interfaces) CNI plugin
Multus is a latin word for "Multi". As the name suggests, it acts as a
Multi plugin in Kubernetes and provides multiple network interface
support in a pod. Multus uses the concept of invoking delegates by
grouping multiple plugins into delegates and invoking them in the
sequential order of the CNI configuration file provided in json format.
* Change CNI version (0.1.0->0.3.1) of Contiv to be compatible with Multus
* [contrib/terraform/openstack] Add worker_allowed_ports
Allow user to define in terraform template which ports and remote
IPs that are allowed to access worker nodes. This is useful when you
don't want to open up whole NodePort range to the outside world, or
ports outside NodePort range.
When using resolvconf_mode host_resolvconf, there is an early DNS
config stage where Kubernetes cluster DNS is not injected for host
DNS initially. Later, the cluster DNS is enabled, but we do not
need to run every task from the kubernetes/preinstall role.
* Add an 'access_ip' for openstack resources to the terraform inventory builder script
* Update Openstack README
* Only use ipv4
* If there's a floating IP assigned to an openstack instance, use that for access_ip
kube-router v0.2.1 highlights from changelog:
- IPv6 WIP but pretty close to full working functionality
- fully support network policy semantics with addition of support for
ipblock and except
* warning on meta flush_handlers
* avoid rm
* avoid "Module remote_tmp /root/.ansible/tmp did not exist and was created with a mode of 0700, this may cause issues when running as another user. To avoid this, create the remote_tmp dir with the correct permissions manually" warning on subsequent tasks using blockinfile
* is match
* failed
* version_compare
* succeeded
* skipped
* success
* version_compare becomes version since ansible 2.5
* ansible minimal version updated in doc and spec
* last version_compare
* [jjo] add kube-router support
Fixes cloudnativelabs/kube-router#147.
* add kube-router as another network_plugin choice
* support most used kube-router flags via
`kube_router_foo` vars as other plugins
* implement replacing kube-proxy (--run-service-proxy=true) via
`kube_proxy_mode: none`, verified in a _non kubeadm_enabled_
install, should also work for recent kubeadm releases via
`skipKubeProxyInstall: true` config
* [jjo] address PR#3339 review from @woopstar
* add busybox image used by kube-router to downloads
* fix busybox download groups key
* rework kubeadm_enabled + kube_router_run_service_proxy
- verify it working ok w/the kubeadm_enabled and
kube_router_run_service_proxy true or false
- introduce `kube_proxy_remove` fact, to decouple logic
from kube_proxy_mode (which affects kubeadm configmap
settings, thus no-good to ab-use it to 'none')
* improve kube-router.md re: kubeadm_enabled and kube_router_run_service_proxy
* address @woopstar latest review
* add inventory/sample/group_vars/k8s-cluster/k8s-net-kube-router.yml
* fix kube_router_run_service_proxy conditional for kube-proxy removal
* fix kube_proxy_remove fact (w/ |bool), add some needed kube-proxy tags on my and existing changes
* update kube-router tolerations for 1.12 compatibility
* add PriorityClass to kube-router DaemonSet
The hosts(5) manpage clearly states that the first entry is the
"canonical name", or FQDN (Fully-Qualified Domain Name):
IP_address canonical_hostname [aliases...]
By using the alias as a first entry, `hostname -f` does not return the
correct domain which breaks all sorts of unrelated functionality (it
has impact over email server configuration, for example).
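To make the field order concrete, here is a small sketch that builds and reads a hosts line in the `IP_address canonical_hostname [aliases...]` layout quoted above. Both helper names and the sample host names used with them are hypothetical.

```python
def format_hosts_entry(ip_address, fqdn, aliases=()):
    """Build a hosts line with the FQDN first, followed by any aliases."""
    return " ".join([ip_address, fqdn] + list(aliases))


def canonical_hostname(hosts_line):
    """Return the canonical name (second field) of a hosts line, or None."""
    # Drop trailing comments, then split on whitespace.
    fields = hosts_line.split("#", 1)[0].split()
    if len(fields) < 2:
        return None
    return fields[1]
```

With the FQDN placed first, `canonical_hostname(format_hosts_entry("10.0.0.5", "node1.cluster.local", ["node1"]))` yields the full domain name rather than the short alias.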
* [jjo] add DIND support to contrib/
- add contrib/dind with ansible playbook to
create "node" containers, and setup them to mimic
host nodes as much as possible (using Ubuntu images),
see contrib/dind/README.md
- nodes' /etc/hosts editing via `blockinfile` and
`lineinfile` need `unsafe_writes: yes` because /etc/hosts
are mounted by docker, and thus can't be handled atomically
(modify copy + rename)
* dind-host role: set node container hostname on creation
* add "Resulting deployment" section with some CLI outputs
* typo
* selectable node_distro: debian, ubuntu
* some fixes for node_distro: ubuntu
* cpu optimization: add early `pkill -STOP agetty`
* typo
* add centos dind support ;)
* add kubespray-dind.yaml, support fedora
- add kubespray-dind.yaml (former custom.yaml at README.md)
- rework README.md as per above
- use some YAML power to share distros' commonality
- add fedora support
* create unique /etc/machine-id and other updates
- create unique /etc/machine-id in each docker node,
used as seed for e.g. weave mac addresses
- with above, now netchecker 100% passes WoHooOO!
🎉🎉🎉
- updated README.md output from (1.12.1, verified
netcheck)
* minor typos
* fix centos node creation, needs earlier udevadm removal to avoid flaky facts, also verified netcheck Ok \o/
* add Q&D test-distros.sh, back to manual /etc/machine-id hack
* run-test-distros.sh cosmetics and minor fixes
* run-test-distros.sh: $rc fix and minor formatting changes
* run-test-distros.sh output cosmetics
* Added Priority class to tiller installation and also fixed tiller override implementation.
* Added changes to handle priority classes separately in tiller, instead of using the variable tiller_override
* #3475 - make dnsmasq to send queries to all servers in upstream. Make dnsmasq config file customizable.
* Code style fixes. Return current behaviour for dnsmasq strict-order flag.
* Fix DNS loop when resolvconf_mode is set to host_resolvconf
* Make sure upstream_dns_servers is defined when using resolvconf_mode == 'host_resolvconf'
* Only set upstream dns servers on KubeDNS and CoreDNS if they are defined
- Local Volume StorageClass configuration is now managed by `local_volume_provisioner_storage_classes`, a list of maps that specifies local storage classes with `name`, `host_dir`, and `mount_dir` keys per entry
- Tasks and templates updated to loop through local volume storage classes
- Previous defaults for path/class names were not changed
- Fixed an issue where `kubernetes/preinstall` was creating directories inconsistently with the `kubernetes-apps/external_provisioner/local_volume_provisioner` task
Internet access is not mandatory as long as the user configures all container image repositories to point to internal container registries, in case of on-premises installation with firewall rules preventing direct Internet access.
* Fix the jinja expression for openstack_tenant_id
OS_PROJECT_ID is obsolete in keystone v3 and jinja expression
doesn't set openstack_tenant_id as expected because of
undefined env var. Fixed the expression.
* Fix the dict iteration method in the kubelet template
Kubelet template rendering fails when additional Node labels are
added and Python 3 is used. Update the method to be compatible with both
Python 2 and 3.
Node labels don't work
According to the documentation, container images are described
by vars like `foo_image_repo` and `foo_image_tag`.
The variables netcheck_{agent,server}_{img_repo,tag} do not
follow that convention.
* Replace `openstack_compute_secgroup_v2` with `openstack_networking_secgroup_v2`
The `openstack_networking_secgroup_v2` resource allows specification of
both ingress and egress. Nova security groups define ingress rules only.
This change will also allow for more user-friendly specified security
rules, as the different security group resources have different HCL
syntax.
Before, Nodes tainted with NoExecute policy did not have calico/weave Pod.
Network pod should run on all nodes whatever happens on a specific node.
Also always set the Pods to be critical.
Also remove deprecated scheduler.alpha.kubernetes.io/tolerations annotations.
Fixes the following problem, which occurs when the value is not quoted:
PLAY [k8s-cluster:etcd:calico-rr] **********************************************
ERROR! Syntax Error while loading YAML.
expected <block end>, but found '<scalar>'
The error appears to have been in '/tmp/vagrant-ansible/inventory/group_vars/k8s-cluster.yml': line 59, column 39, but may
be elsewhere in the file depending on the exact syntax problem.
The offending line appears to be:
kube_oidc_ca_file: {{ kube_cert_dir }}/openid-ca.pem
^ here
We could be wrong, but this one looks like it might be an issue with
missing quotes. Always quote template expression brackets when they
start a value. For instance:
with_items:
- {{ foo }}
Should be written as:
with_items:
- "{{ foo }}"
Ansible failed to complete successfully. Any error output should be
visible above. Please fix these errors and try again.
* Changes to assign pod priority to kube components.
* Removed the boolean flag pod_priority_assignment
* Created new priorityclass k8s-cluster-critical
* Fixed the trailing spaces
* Added kube version check while creating Priority Class k8s-cluster-critical
* Moved k8s-cluster-critical.yml
* Moved k8s-cluster-critical.yml to kube_config_dir
* fix openstack cli syntax
* 'allowed-address' is also a dash, not an underscore
* multiple allowed-address
multiple allowed-address must be in separate parameters
When enable_network_policy is set to True with Calico 3 kubectl
apply fails with the error:
The Deployment "calico-kube-controllers" is invalid:
spec.strategy.rollingUpdate: Forbidden: may not be specified when
strategy type is 'Recreate'
See
https://github.com/kubernetes-incubator/kubespray/issues/3267
Changing the update strategy to RollingUpdate avoids this error.
ensure there is pin priority for docker package to avoid upgrade of docker to incompatible version
remove empty when line
ensure there is pin priority for docker package to avoid upgrade of docker to incompatible version
force kubeadm upgrade due to failure without --force flag
ensure there is pin priority for docker package to avoid upgrade of docker to incompatible version
added nodeSelector to have compatibility with hybrid cluster with win nodes, also fix for download with missing container type
fixes in syntax and LF for newline in files
fix on yamllint check
ensure there is pin priority for docker package to avoid upgrade of docker to incompatible version
some cleanup for unnecessary lines
remove conditions for nodeselector
* calico upgrade to v3
* update calico_rr version
* add missing file
* change contents of main.yml as it was left old version
* enable network policy by default
* remove unneeded task
* Fix kubelet calico settings
* fix when statement
* switch back to node-kubeconfig.yaml
This change allows the playbook used in Vagrant to be
defined by the end user.
This is useful in the case where a developer may want to use
their own playbook that imports Kubespray, but also leverage
the Kubespray Vagrantfile.
Upgrade Kubernetes to V1.11.2
The kubeadm configuration file version has been upgraded from v1alpha1 to v1alpha2
Add bootstrap kubeadm-config.yaml with external etcd
The post-remove action fails during the kubectl delete node action with rc: 2, command not found. kubectl is not in the system PATH, so the full path to the binary is required.
* Update local-volume-provisioner-ds.yml.j2
After v1.10.2 default mountPropagation is "None"
* local_volume_provisioner version bump
v2.1.0 uses the beta nodeAffinity API by default which is available starting 1.10
* Update local-volume-provisioner-ds.yml.j2
MY_NAMESPACE env
* Update README.md
Raw block devices docs.
* kubedns & kubedns-autoscaler: Stick to master nodes.
- Tolerate only master nodes and not any NoSchedule taint
- Pods are on different nodes
- Pods are required to be on a master node.
* kubedns: use soft nodeAffinity.
Prefer to be on a master node, don't require.
* coredns: Stick to (different) master nodes.
- Pods are on different nodes
- Pods are preferred to be on a master node.
According to cluster/binary.yml vault binary will be placed to `{{ bin_dir }}` and according to `inventory/sample/group_vars/all.yml` that is
`inventory/sample/group_vars/all.yml`
Attempting to clarify the language surrounding the etcd node deployment script failure mechanism. I had this error when doing a new cluster deployment last night and, though it should have been, it wasn't immediately apparent to me what was causing the issue (since my default master node hostnames do not specify whether they are also acting as etcd replicas).
ingress-nginx 0.16.2 (https://github.com/kubernetes/ingress-nginx/releases/tag/nginx-0.16.2)
This patch simplifies the ingress-nginx deployment by deploying on
master by default, with customizable options; it also removes the
additional Ansible group "kube-ingress" and its k8s node label
injection.
Reference to https://kubernetes.io/docs/concepts/services-networking/ingress/#prerequisites:
GCE/Google Kubernetes Engine deploys an ingress controller on the master.
By changing `ingress_nginx_nodeselector` plus custom k8s node
label, user could customize the DaemonSet deployment target.
If `ingress_nginx_nodeselector` is empty, the DaemonSet is deployed on
every k8s node.
- cephfs-provisioner 06fddbe2 (https://github.com/kubernetes-incubator/external-storage/tree/06fddbe2/ceph/cephfs)
Notable changes from upstream:
- Added storage class parameters to specify a root path within the backing cephfs and, optionally, use deterministic directory and user names (https://github.com/kubernetes-incubator/external-storage/pull/696)
- Support capacity (https://github.com/kubernetes-incubator/external-storage/pull/770)
- Enable metrics server (https://github.com/kubernetes-incubator/external-storage/pull/797)
Other notable changes:
- Clean up legacy manifests file naming
- Remove legacy manifests, namespace and storageclass before upgrade
- `cephfs_provisioner_monitors` simplified as string
- Default to new deterministic naming
- Add `reclaimPolicy` support in StorageClass
With legacy non-deterministic naming style (where $UUID values are generated randomly):
- cephfs_provisioner_claim_root: /volumes/kubernetes
- cephfs_provisioner_deterministic_names: false
- Generated CephFS volume: /volumes/kubernetes/kubernetes-dynamic-pvc-$UUID
- Generated CephFS user: kubernetes-dynamic-user-$UUID
With new default deterministic naming style (where $NAMESPACE and $PVC are predictable):
- cephfs_provisioner_claim_root: /volumes
- cephfs_provisioner_deterministic_names: true
- Generated CephFS volume: /volumes/$NAMESPACE/$PVC
- Generated CephFS user: k8s.$NAMESPACE.$PVC
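The two naming styles listed above can be summarised in a short sketch. The function `cephfs_names` and its parameters are hypothetical and exist only to restate the patterns from the list; the provisioner's own code is not shown in this log.

```python
def cephfs_names(claim_root, deterministic, namespace, pvc, uuid):
    """Return (volume_path, user_name) for the chosen naming style."""
    if deterministic:
        # New default: names are predictable from the namespace and PVC name.
        return (
            "%s/%s/%s" % (claim_root, namespace, pvc),
            "k8s.%s.%s" % (namespace, pvc),
        )
    # Legacy style: names are derived from a generated UUID.
    return (
        "%s/kubernetes-dynamic-pvc-%s" % (claim_root, uuid),
        "kubernetes-dynamic-user-%s" % uuid,
    )
```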
* Add supplementary node groups
To add additional ansible groups to the k8s nodes, such as
`kube-ingress` for running ingress controller pods. Empty by default.
Red Hat has this theory that binaries in sbin are too dangerous to be on
the default path, but we need them anyway.
RH7 has /sbin and /usr/sbin as symlinks, so that is no longer important.
I'm adding it to the `PATH` instead of making the path to `modinfo`
absolute because I am worried about breaking support for other
distributions.
The README says that, to check if Python and pip are installed, you should type:
```
python -v && pip -v
```
Lowercase `-v` is `--verbose`, uppercase `-V` is `--version`. The
command should be:
```
python -V && pip -V
```
Currently all the gcr.io images used in kubespray can only run on x86.
Also, gcr.io does not fully support multi-arch docker images.
Add extra var "image_arch" (default is amd64) to support running other
platforms, like arm64.
Change-Id: I8e1c9af533c021cb96ade291a1ce58773b40e271
On Aarch64, the default cgroup driver for docker is systemd
instead of cgroupfs. kubelet should use systemd as its cgroup
driver as well, to stay consistent with docker.
Without this change, the following exception is raised:
/usr/bin/docker-current: Error response from daemon: shim
error: docker-runc not installed on system.
Change-Id: Id496ec9eaac6580e4da2f3ef1a386c9abc2a5129
The number of pods on a given node is determined by the --max-pods=k
directive. When the address space is exhausted, no more pods can be
scheduled even if, from the --max-pods perspective, the node still has
capacity.
The special case that a pod is scheduled and uses the node IP in the
host network namespace is too "soft" to derive a guarantee.
Comparing kubelet_max_pods with kube_network_node_prefix, when given,
makes it possible to assert that pod limits match the CIDR address space.
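A minimal sketch of such a comparison, assuming IPv4 and assuming that two addresses per node subnet (network and broadcast) are unusable. The function names are hypothetical and a given network plugin may reserve more addresses than this.

```python
def max_pod_addresses(node_prefix: int) -> int:
    """Usable IPv4 addresses in a per-node pod subnet of the given prefix length."""
    if not 0 <= node_prefix <= 30:
        raise ValueError("node prefix must be between 0 and 30")
    # Assumption: the network and broadcast addresses cannot be given to pods.
    return 2 ** (32 - node_prefix) - 2


def pod_limit_fits(kubelet_max_pods: int, kube_network_node_prefix: int) -> bool:
    """True if the pod limit does not exceed the node's pod address space."""
    return kubelet_max_pods <= max_pod_addresses(kube_network_node_prefix)


print(pod_limit_fits(110, 24))  # a /24 leaves 254 addresses
print(pod_limit_fits(110, 26))  # a /26 leaves 62 addresses
```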
* Move front-proxy-client certs back to kube mount
We want the same CA for all k8s certs
* Refactor vault to use a third party module
The module adds idempotency and reduces some of the repetitive
logic in the vault role
Requires ansible-modules-hashivault on ansible node and hvac
on the vault hosts themselves
Add upgrade test scenario
Remove bootstrap-os tags from tasks
* fix upgrade issues
* improve unseal logic
* specify ca and fix etcd check
* Fix initialization check
bump machine size
* [terraform/openstack] Restores ability to use existing public nodes and masters as bastion.
* [terraform/openstack] Uses network_id as output
* [terraform/openstack] Fixes link to inventory/local/group_vars
* [terraform/openstack] Adds supplementary master groups
* [terraform/openstack] Updates documentation avoiding manual setups for bastion (as they are not needed now).
* [terraform/openstack] Supplementary master groups in docs.
* [terraform/openstack] Fixes repeated usage of master fips instead of bastion fips
* [terraform/openstack] Missing change for network_id to subnet_id
* [terraform/openstack] Changes conditional to element( concat ) form to avoid type issues with empty lists.
* sysctl file should be in defaults so that it can be overridden
* Change sysctl_file_path to be consistent with roles/kubernetes/preinstall/defaults/main.yml
pip was always being downloaded on subsequent runs. This PR always runs the pip command and checks its return code before downloading pip.
Fix in favor of #2582
Kubespray should not install any helm charts. This is a task
that a user should do on his/her own through ansible or another
tool. It opens the door to wrapping installation of any helm
chart.
The change to support multiple inventory paths led to the Vagrant environment
not getting a default group_vars in its inventory path. Use sample as the
default path if none is specified.
Fix issue #2541
Signed-off-by: Ganesh Maharaj Mahalingam <ganesh.mahalingam@intel.com>
The RPM file that's provided by upstream can be used for SUSE
distributions as well. Moreover we simplify the playbook to use
the 'package' module to install packages across different distros.
Link: https://github.com/rkt/rkt/pull/3904
If the 'docker' package is already installed, then the handlers will not
run and the service will not be (re-)started. As such, let's make sure
that the service is started even if the packages are already installed.
Add support for installing Docker on SUSE distributions. The Docker
repository at https://yum.dockerproject.org/repo/main/ does not support
recent openSUSE distributions so the only alternative is to use the
packages from the distro repositories. This however renders the
'docker_version' Ansible variable useless on SUSE.
The openssl package on Tumbleweed is actually a virtual package covering
openssl-1.0.0 and openssl-1.1.0 implementations. It defaults to 1.1.0 so
when trying to install it and openssl-1.0.0 is installed, zypper fails
with conflicts. As such, let's explicitly pull the package that we need
which also updates the virtual one.
Co-authored-by: Markos Chandras <mchandras@suse.de>
Depending on the VM configuration, vagrant may either use 'rsync' or
vboxfs for populating the working directory to the VM. However, vboxfs
means that any files created by the VM will also be present on the host.
As such, let's be explicit and always use 'rsync' to copy the directory
to the VM so we can keep the host copy clean. Moreover, the default
rsync options include '--copy-links' and this breaks rsync if there are
missing symlinks in the working directory like the following one:
Error: symlink has no referent:
"/home/user/kubespray/contrib/network-storage/glusterfs/group_vars"
As such, we override the default options to drop --copy-links.
While `do` looks cleaner, forcing this extra option in ansible.cfg
seems to be more invasive. It would be better to keep the traditional
approach of `set dummy = ` instead.
The current way to set up the etcd cluster is messy and buggy.
- It checks that the cluster is healthy before the cluster is even created.
- The unit files are started on handlers, not in the task, so you mess with "flush handlers".
- The join_member.yml is not used.
- etcd events cluster is not configured for kubeadm
- remove duplicate runs between running the role on etcd nodes and k8s nodes
* Remove old docker packages
This removes docker packages that are obsolete if docker-ce packages are to be installed, which fixes some package conflict issues that can occur during upgrades.
* Add support for setting obsoletes=0 when installing docker with yum
The default for kibana_base_url does not make sense and makes kibana unusable. The default path forces a 404 when you try to open kibana in the browser. Not setting kibana_base_url works just fine.
Added CoreDNS to downloads
Updated with labels. Should now work without RBAC too
Fix DNS settings on hosts
Rename CoreDNS service from kube-dns to coredns
Add rotate based on http://edgeofsanity.net/rant/2017/12/20/systemd-resolved-is-broken.html
Updated docs with CoreDNS info
Added labels and fixed minor settings from official yaml file: https://github.com/kubernetes/kubernetes/blob/release-1.9/cluster/addons/dns/coredns.yaml.sed
Added a secondary deployment and secondary service ip. This is to mitigate dns timeouts and improve resiliency against failures. See discussion at 'https://github.com/coreos/coreos-kubernetes/issues/641#issuecomment-281174806'
Set dns list correct. Thanks to @whereismyjetpack
Only download KubeDNS or CoreDNS if selected
Move dns cleanup to its own file and import tasks based on dns mode
Fix install of KubeDNS when dnsmasq_kubedns mode is selected
Add new dns option coredns_dual for dual stack deployment. Added variable to configure replicas deployed. Updated docs for dual stack deployment. Removed rotate option in resolv.conf.
Run DNS manifests for CoreDNS and KubeDNS
Set skydns servers on dual stack deployment
Use only one template for CoreDNS dual deployment
Set correct cluster ip for the dns server
Flannel uses the interface set by the --iface option for inter-host communication.
It defaults to the interface for the default route on the machine.
The flannel config is set via a daemonset, so it is the same on all nodes.
But different nodes can have different interface names for the inter-host communication network.
The --iface-regex option allows flannel to find the interface that carries an address from the inter-host communication network.
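The idea can be sketched as follows. This is not flannel's implementation: it only illustrates, under the assumption that the regex is tried against both the interface name and its IP address, how one expression can select differently named interfaces on different nodes. The function name and the sample addresses are hypothetical.

```python
import re


def pick_interface(addresses, iface_regex):
    """Return the first interface whose name or IP matches the regex, else None.

    `addresses` maps interface name -> IPv4 address (hypothetical input shape).
    """
    pattern = re.compile(iface_regex)
    for name, ip in addresses.items():
        if pattern.search(name) or pattern.search(ip):
            return name
    return None


# Two nodes with different interface names on the same 10.0.0.0/24 network.
node_a = {"eth0": "192.168.1.5", "ens3": "10.0.0.7"}
node_b = {"lo": "127.0.0.1", "enp0s8": "10.0.0.9"}
print(pick_interface(node_a, r"^10\.0\.0\."))
print(pick_interface(node_b, r"^10\.0\.0\."))
```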
* Added option for encrypting secrets to etcd
* Fix keylength to 32
* Forgot the default
* Rename secrets.yaml to secrets_encryption.yaml
* Fix static path for secrets file to use ansible variable
* Rename secrets.yaml.j2 to secrets_encryption.yaml.j2
* Base64 encode the token
* Fixed merge error
* Changed path to credentials dir
* Update path to secrets file which is now readable inside the apiserver container. Set better file permissions
* Add encryption option to k8s-cluster.yml
Setting the following:
```
kube_kubeadm_controller_extra_args:
  address: 0.0.0.0
  terminated-pod-gc-threshold: "100"
```
Results in `terminated-pod-gc-threshold: 100` in the kubeadm config file. But it has to be a string to work.
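To illustrate the intended result, here is a minimal sketch that renders every extra-arg value as a quoted string. It is not the template used by Kubespray; `render_extra_args` is a hypothetical helper, and it relies on a JSON string also being a valid double-quoted YAML scalar for simple values like these.

```python
import json


def render_extra_args(extra_args):
    """Render key/value pairs as YAML-style lines with every value quoted."""
    lines = []
    for key, value in extra_args.items():
        # str() first, so numbers such as 100 become the string "100".
        lines.append(f"{key}: {json.dumps(str(value))}")
    return "\n".join(lines)


print(render_extra_args({"address": "0.0.0.0", "terminated-pod-gc-threshold": 100}))
```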
* Multiple files are now supported across operations.
* Can be specified as a list or a comma separated string.
* Single item per task params will still work without changes.
* Added `files`, `filenames`, and `file`, as aliases for the `filename` param.
* Improved output of error message to always include stderr
* `exists` now supports checking files
Follow up PRs encouraged across roles to start converting `with_items` loops on `kube` tasks into `files` param lists so we can improve performance.
This is trying to match what the roles/bastion-ssh-config is trying to do. When the setup is going through bastion, we want the ssh private key to be used on the bastion instance.
to the API server configuration.
This solves the problem where if you have non-resolvable node names,
and try to scale the server by adding new nodes, kubectl commands
start to fail for newly added nodes, giving a TCP timeout error when
trying to resolve the node hostname against a public DNS.
Adding this into the default example inventory so it has less of a chance of biting others after weeks of random failures (etcd does not report that it has run out of RAM, it just stalls). 512MB was not enough for us to run one of our products.
* Fix run kubectl error
Fix run kubectl error when first master doesn't work
* if access_ip is defined, use first_kube_master;
otherwise each master uses a different ip
* Delete set first_kube_master and use kube_apiserver_access_address
* Set filemode to 0640
weave-net.yml file is readable by all users on the host. It however contains the weave_password to encrypt all pod communication. It should only be readable by root.
* Set mode 0640 on users_file with basic auth
* Added cilium support
* Fix typo in debian test config
* Remove empty lines
* Changed cilium version from <latest> to <v1.0.0-rc3>
* Add missing changes for cilium
* Add cilium to CI pipeline
* Fix wrong file name
* Check kernel version for cilium
* fixed ci error
* fixed cilium-ds.j2 template
* added waiting for cilium pods to run
* Fixed missing EOF
* Fixed trailing spaces
* Fixed trailing spaces
* Fixed trailing spaces
* Fixed too many blank lines
* Updated tolerations,annotations in cilium DS template
* Set cilium_version to iptables-1.9 to see if bug is fixed in CI
* Update cilium image tag to v1.0.0-rc4
* Update Cilium test case CI vars filenames
* Add optional prometheus flag, adjust initial readiness delay
* Update README.md with cilium info
When etcd exceeds its memory limit, it becomes useless but keeps running.
We should let OOM killer kill etcd process in the container, so systemd can spot
the problem and restart etcd according to "Restart" setting in etcd.service unit file.
If the OOM problem keeps repeating, i.e. it happens on every single restart,
systemd will eventually back off and stop restarting it anyway.
--restart=on-failure:5 in this file has no effect because memory allocation error
doesn't by itself cause the process to die
Related: https://github.com/kubernetes-incubator/kubespray/blob/master/roles/etcd/templates/etcd-docker.service.j2
This kind of reverts a change introduced in #1860.
Even though there is kubeadm_token_ttl=0, which means that the kubeadm token never expires, the token is not present in `kubeadm token list` after the cluster is provisioned (at least after it has been running for some time). There is an issue regarding this, https://github.com/kubernetes/kubeadm/issues/335, so we need to create a new temporary token during the cluster upgrade.
Ansible automatically installs the python-apt package when using
the 'apt' Ansible module, if python-apt is not present. This patch
removes the (unneeded) explicit installation in the Kubespray
'preinstall' role.
The default path assumes that the vagrant dir is called 'inventory'.
With custom defined inventory dirs that are not called 'inventory' this
fails to create the correct symlink under .vagrant.d.
In some installations, it can take up to 3 seconds to get the value. Retrying
for 5 seconds will ensure the command won't return 1.
Signed-off-by: Sébastien Han <seb@redhat.com>
* allow installs to not have hostname overridden with fqdn from inventory
* calico-config no longer requires local as and will default to global
* when cloudprovider is not defined, use the inventory_hostname for cni-calico
* allow reset to not restart network (buggy nodes die with this cmd)
* default kube_override_hostname to inventory_hostname instead of ansible_hostname
The "centos/7" box is the official centos box and supports all the major
providers:
virtualbox Externally hosted (cloud.centos.org)
vmware_desktop Externally hosted (cloud.centos.org)
libvirt Externally hosted (cloud.centos.org)
hyperv Externally hosted (cloud.centos.org)
Where bento/centos-7.3 only supports:
parallels Hosted by Vagrant Cloud (570 MB)
virtualbox Hosted by Vagrant Cloud (525 MB)
vmware_desktop Hosted by Vagrant Cloud (608 MB)
Signed-off-by: Sébastien Han <seb@redhat.com>
When testing deployments of SDS, it is quite useful to get a Kubernetes
env with nodes having dedicated drives.
You can now enable this by setting: kube_node_instances_with_disks: true
Also you can choose the number of drives per machine and their respective
size:
* kube_node_instances_with_disks_number: 10
* kube_node_instances_with_disks_size: "20G"
Signed-off-by: Sébastien Han <seb@redhat.com>
Cloud resolvers are mandatory for hosts on GCE and OpenStack
clouds. The 8.8.8.8 alternative resolver was dropped because
there is already a default nameserver. The new var name
reflects the purpose better.
Also restart apiserver when modifying dns settings.
If you configure your external loadbalancer to do a simple tcp pass-through to the api servers, and you do not use a DNS FQDN but just the IP, then you need to add the IP address to the certificates too.
Example config:
```
## External LB example config
apiserver_loadbalancer_domain_name: "10.50.63.10"
loadbalancer_apiserver:
  address: 10.50.63.10
  port: 8383
```
Some installations are failing to authenticate with peers due to
etcd picking up/resolving the wrong node.
By setting 'etcd_peer_client_auth' to "False" you can disable peer client cert
authentication.
Signed-off-by: Sébastien Han <seb@redhat.com>
* Update rpm spec and pbr setup configs
* Rename package to kubespray
* Do not break Fedora's FHS and install to /usr/share instead
* Remove the vendor tag
* Update source0 for better artifacts' names
* Fix missing files build errors
* Make version/release to auto match from git and fit PEP 440
Co-authored-by: Matthias Runge <mrunge@redhat.com>
Signed-off-by: Bogdan Dobrelya <bogdando@mail.ru>
* Add package paths to roles search in ansible conf
Signed-off-by: Bogdan Dobrelya <bogdando@mail.ru>
* Poke jinja2 requirements in rpm spec file
Signed-off-by: Bogdan Dobrelya <bogdando@mail.ru>
kube-proxy is complaining of missing modules at startup. There is a plan
to also support an LVS implementation of kube-proxy in addition to
userspace and iptables.
* Fix HA docs API access endpoints explained
Follow-up commit 81347298a3
and fix the endpoint value provided in HA docs.
Signed-off-by: Bogdan Dobrelya <bogdando@mail.ru>
* Clarify internal LB with external LB use case
* Clarify how to use both internal and external, non-cluster aware and
not managed with Kubespray, LB solutions.
* Clarify the requirements, like TLS/SSL termination, for such an external LB.
Unlike the 'cluster-aware' external LB config, endpoints' security must be
managed by that non-cluster aware external LB.
* Note that masters always contact their local apiservers via https://bip:sp.
It's highly unlikely to go down and it reduces latency that might be
introduced when going host->lb->host. Only computes go that path.
Signed-off-by: Bogdan Dobrelya <bogdando@mail.ru>
* Add a note for supplementary_addresses_in_ssl_keys
Explain how to benefit from supplementary_addresses_in_ssl_keys
Signed-off-by: Bogdan Dobrelya <bogdando@mail.ru>
Hardcoded variables are removed from variables.tf file because it might
not be suitable for all OpenStack Cloud depending on Identity API
version available (between v2 or v3) and preferred authentication
method.
Auto configure API access endpoint with a custom bind IP, if provided.
Fix the HA docs: the http URLs are in fact https. Clarify the insecure vs
secure API access modes as well.
Closes: #issues/2051
Signed-off-by: Bogdan Dobrelya <bogdando@mail.ru>
Update checksum for kubeadm
Use v1.9.0 kubeadm params
Include hash of ca.crt for kubeadm join
Update tag for testing upgrades
Add workaround for testing upgrades
Remove scale CI scenarios because of slow inventory parsing
in ansible 2.4.x.
Change region for tests to us-central1 to
improve ansible performance
Starting with Kubernetes v1.8.4, kubelet ignores the AWS cloud
provider string and uses the override hostname, which fails
Node admission checks.
Fixes #2094
The search line in /etc/resolv.conf could have
multiple spaces or tabs between domains.
split(' ') will give wrong results in some cases,
use split() without argument instead.
e.g.
>>> 'domain.tld\tcluster.tld '.split(' ')
['domain.tld\tcluster.tld', '']
>>> 'domain.tld\tcluster.tld '.split()
['domain.tld', 'cluster.tld']
As we have seen with other containers, sometimes container removal fails on the first attempt due to some Docker bugs. Retrying typically corrects the issue.
Use an etcd-initer init container to generate etcd args; it determines
the etcd name by comparing its own ip with the etcd cluster ips. This
makes the etcd configuration independent of the ansible templating, so
that adding master nodes becomes easier.
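The name lookup described here could look roughly like the sketch below. It is not the etcd-initer script itself; the function name and the `etcd<N>` naming scheme (position in the cluster ip list, starting at 1) are assumptions made for illustration.

```python
def etcd_member_name(own_ip, cluster_ips, prefix="etcd"):
    """Derive a member name from the position of own_ip in the cluster list.

    Hypothetical helper: the real init container may name members differently.
    """
    for index, ip in enumerate(cluster_ips, start=1):
        if ip == own_ip:
            return f"{prefix}{index}"
    raise ValueError(f"{own_ip} is not part of the etcd cluster {cluster_ips}")


cluster = ["10.0.0.1", "10.0.0.2", "10.0.0.3"]
print(etcd_member_name("10.0.0.2", cluster))
```

Because the name depends only on the node's own ip and the cluster ip list, a new master needs no per-node templated value.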
Putting contiv etcd and etcd-proxy into the same daemonset and managing
the difference with an env file does not scale well (adding nodes).
This commit splits them into two daemonsets so that when adding nodes,
k8s can automatically start an etcd-proxy on new nodes without needing
to run the related play that puts the env file in place.
* Remove the network device created by the flannel
Remove the network device created by the flannel
* Modify flannel.1 device path
Modify flannel.1 device path
* remove trailing spaces
This allows `kube_apiserver_insecure_port` to be set to 0 (disabled).
Rework of #1937 with kubeadm support
Also, fixed an issue in `kubeadm-migrate-certs` where the old apiserver cert was copied as the kubeadm key
* Adding bastion and private network provisioning for openstack terraform
* Remove usage of floating-ip property
* Combine openstack instances + floating ips
* Fix relating floating IPs to hosts for openstack builds
* Tighten up security groups
Allow ssh into all instances with floating IP
* Add the gluster hosts to the no-floating group
* Break terraform into modules
* Update README and var descriptions to match current config
* Remove volume property in gluster compute def
* Include cluster name in internal network and router names
* Make dns_nameservers a variable
* Properly tag instances and subnets with `kubernetes.io/cluster/$cluster_name`
This is required by kubernetes to support multiple clusters in a single vpc/az
* Get rid of loadbalancer_apiserver_address as it is no longer needed
* Dynamically retrieve aws_bastion_ami latest reference by querying AWS rather than hard coded
* Dynamically retrieve the list of availability_zones instead of needing to have them hard coded
* Limit availability zones to first 2, using slice interpolation function
* Replace the need for hardcoded variable "aws_cluster_ami" by the data provided by Terraform
* Move ami choosing to vars, so people don't need to edit create infrastructure if they want another vendor image (as suggested by @atoms)
* Make name of the data block agnostic of distribution, given there are more than one distribution supported
* Add documentation about other distros being supported and what to change in which location to make these changes
* Allow setting --bind-address for apiserver hyperkube
This is required if you wish to configure a loadbalancer (e.g haproxy)
running on the master nodes without choosing a different port for the
vip from that used by the API - in this case you need the API to bind to
a specific interface, then haproxy can bind the same port on the VIP:
[root@overcloud-controller-0 ~]# netstat -taupen | grep 6443
tcp 0 0 192.168.24.6:6443 0.0.0.0:* LISTEN 0 680613 134504/haproxy
tcp 0 0 192.168.24.16:6443 0.0.0.0:* LISTEN 0 653329 131423/hyperkube
tcp 0 0 192.168.24.16:6443 192.168.24.16:58404 ESTABLISHED 0 652991 131423/hyperkube
tcp 0 0 192.168.24.16:58404 192.168.24.16:6443 ESTABLISHED 0 652986 131423/hyperkube
This can be achieved e.g via:
kube_apiserver_bind_address: 192.168.24.16
* Address code review feedback
* Update kube-apiserver.manifest.j2
* Add Contiv support
Contiv is a network plugin for Kubernetes and Docker. It supports
vlan/vxlan/BGP/Cisco ACI technologies. It supports firewall policies,
multiple networks and bridging pods onto physical networks.
* Update contiv version to 1.1.4
Update contiv version to 1.1.4 and added SVC_SUBNET in contiv-config.
* Load openvswitch module to workaround on CentOS7.4
* Set contiv cni version to 0.1.0
Correct contiv CNI version to 0.1.0.
* Use kube_apiserver_endpoint for K8S_API_SERVER
Use kube_apiserver_endpoint as K8S_API_SERVER to make contiv talk
to an available endpoint no matter if there's a loadbalancer or not.
* Make contiv use its own etcd
Before this commit, contiv was using an etcd proxy mode to the k8s etcd.
This works fine when the etcd hosts are co-located with the contiv etcd
proxy. However, the k8s peering certs are only in the etcd group; as a
result the etcd-proxy is not able to peer with the k8s etcd on the
etcd group. In addition, the netplugin always tries to find the etcd
endpoint on localhost, which causes problems for all netplugins
not running on etcd group nodes.
This commit makes contiv use its own etcd, separate from the k8s one.
On kube-master nodes (where net-master runs), it runs in leader
mode, and on all other nodes it runs in proxy mode.
* Use cp instead of rsync to copy cni binaries
Since rsync has been removed from hyperkube, this commit changes it
to use cp instead.
* Make contiv-etcd able to run on master nodes
* Add rbac_enabled flag for contiv pods
* Add contiv into CNI network plugin lists
* migrate contiv test to tests/files
Signed-off-by: Cristian Staretu <cristian.staretu@gmail.com>
* Add required rules for contiv netplugin
* Better handling json return of fwdMode
* Make contiv etcd port configurable
* Use default var instead of templating
* roles/download/defaults/main.yml: use contiv 1.1.7
Signed-off-by: Cristian Staretu <cristian.staretu@gmail.com>
Move RS to deployment so no need to take care of the revision history
limits :
- Delete the old RS
- Make Calico manifest a deployment
- move deployments to apps/v1beta2 API since Kubernetes 1.8
* Defaults for apiserver_loadbalancer_domain_name
When loadbalancer_apiserver is defined, use the
apiserver_loadbalancer_domain_name with a given default value.
Fix inconsistencies for checking if apiserver_loadbalancer_domain_name
is defined AND using it with a default value provided at once.
Signed-off-by: Bogdan Dobrelya <bogdando@mail.ru>
* Define defaults for LB modes in common defaults
Adjust the defaults for apiserver_loadbalancer_domain_name and
loadbalancer_apiserver_localhost to come from a single source, which is
kubespray-defaults. Removes some confusion and simplifies the code.
Signed-off-by: Bogdan Dobrelya <bogdando@mail.ru>
Thought this wasn't required at first but I forgot there's no auto flush at the end of these tasks since the `kubernetes/master` role is not the end of the play.
* Fixes an issue where apiserver and friends (controller manager, scheduler) were prevented from restarting after manifests/secrets are changed. This occurred when a replaced kubelet doesn't reconcile new master manifests, which caused old master component versions to linger during deployment. In my case this was causing upgrades from k8s 1.6/1.7 -> k8s 1.8 to fail
* Improves transitions from kubelet container to host kubelet by preventing issues where kubelet container reappeared during the deployment
I think there was a mistake here:
"{{ peer_with_calico_rr is defined and peer_with_calico_rr }} and kube_network_plugin == 'calico'"
should be
"{{ peer_with_calico_rr is defined and peer_with_calico_rr and kube_network_plugin == 'calico' }}"
This is causing calico_rr to be downloaded even if you are using something other than calico.
This allows `kube_apiserver_insecure_port` to be set to 0 (disabled). It's working, but so far I have had to:
1. Make the `uri` module "Wait for apiserver up" checks use `kube_apiserver_port` (HTTPS)
2. Add apiserver client cert/key to the "Wait for apiserver up" checks
3. Update apiserver liveness probe to use HTTPS ports
4. Set `kube_api_anonymous_auth` to true to allow liveness probe to hit apiserver's /healthz over HTTPS (livenessProbes can't use client cert/key unfortunately)
5. RBAC has to be enabled. Anonymous requests are in the `system:unauthenticated` group which is granted access to /healthz by one of RBAC's default ClusterRoleBindings. An equivalent ABAC rule could allow this as well.
Changes 1 and 2 should work for everyone, but 3, 4, and 5 require new coupling of currently independent configuration settings. So I also added a new settings check.
Options:
1. The problem goes away if you have both anonymous-auth and RBAC enabled. This is how kubeadm does it. This may be the best way to go since RBAC is already on by default but anonymous auth is not.
2. Include conditional templates to set a different liveness probe for possible combinations of `kube_apiserver_insecure_port = 0`, RBAC, and `kube_api_anonymous_auth` (won't be possible to cover every case without a guaranteed authorizer for the secure port)
3. Use basic auth headers for the liveness probe (I really don't like this, it adds a new dependency on basic auth which I'd also like to leave independently configurable, and it requires encoded passwords in the apiserver manifest)
Option 1 seems like the clear winner to me, but is there a reason we wouldn't want anonymous-auth on by default? The apiserver binary defaults anonymous-auth to true, but kubespray's default was false.
* Change deprecated vagrant ansible flag 'sudo' to 'become'
* Emphasize that the name of the pip_python_modules is only considered on CoreOS
* Remove useless unused variable
* Fix warning when jinja2 template-delimiters used in when statement
There is no need for jinja2 template-delimiters like {{ }} or {% %}
any more. They can just be omitted as described in https://github.com/ansible/ansible/issues/22397
* Fix broken link in getting-started guide
* Change deprecated vagrant ansible flag 'sudo' to 'become'
* Workaround ansible bug where access var via dict doesn't get real value
When accessing a variable via its name "{{ foo }}", its value is
retrieved. But when the variable value is retrieved via the vars-dict
"{{ vars['foo'] }}", the expression of the variable is no longer
resolved, due to a bug. So e.g. an expression foo="{{ 1 == 1 }}" is no
longer resolved but just returned as the string "1 == 1".
* Make file yamllint compliant
When proxy vars are set, `uri` module tasks will attempt to route traffic through the proxy. This causes the "Wait for" tasks in the `etcd` and `kubernetes/master` roles to hang, as localhost connections struggle with a proxy.
As far as I know these roles only need local/cluster networking, so a proxy doesn't apply here anyway.
Some time ago I think the hardcoded `/var/lib/docker` was required, but kubelet running in a container has been aware of the Docker path since at least as far back as k8s 1.6.
Without this change, you see a large number of errors in the kubelet logs if you installed with a non-default `docker_daemon_graph`
This allows overriding of apt repo endpoints when internet sources are not accessible. Additionally, switch to using the dockerproject.org gpg key url for apt instead of keyservers.net
about: Support request or question relating to Kubespray
labels: triage/support
---
<!--
STOP -- PLEASE READ!
GitHub is not the right place for support requests.
If you're looking for help, check [Stack Overflow](https://stackoverflow.com/questions/tagged/kubespray) and the [troubleshooting guide](https://kubernetes.io/docs/tasks/debug-application-cluster/troubleshooting/).
You can also post your question on the [Kubernetes Slack](http://slack.k8s.io/) or the [Discuss Kubernetes](https://discuss.kubernetes.io/) forum.
If the matter is security related, please disclose it privately via https://kubernetes.io/security/.
<!-- Thanks for sending a pull request! Here are some tips for you:
1. If this is your first time, please read our contributor guidelines: https://git.k8s.io/community/contributors/guide#your-first-contribution and developer guide https://git.k8s.io/community/contributors/devel/development.md#development-guide
2. Please label this pull request according to what type of issue you are addressing, especially if this is a release targeted pull request. For reference on required PR/issue labels, read here:
3. Ensure you have added or ran the appropriate tests for your PR: https://git.k8s.io/community/contributors/devel/testing.md
4. If you want *faster* PR reviews, read how: https://git.k8s.io/community/contributors/guide/pull-requests.md#best-practices-for-faster-reviews
5. Follow the instructions for writing a release note: https://git.k8s.io/community/contributors/guide/release-notes.md
6. If the PR is unfinished, see how to mark it: https://git.k8s.io/community/contributors/guide/pull-requests.md#marking-unfinished-pull-requests
-->
**What type of PR is this?**
> Uncomment only one ` /kind <>` line, hit enter to put that in a new line, and remove leading whitespaces from that line:
>
> /kind api-change
> /kind bug
> /kind cleanup
> /kind design
> /kind documentation
> /kind failing-test
> /kind feature
> /kind flake
**What this PR does / why we need it**:
**Which issue(s) this PR fixes**:
<!--
*Automatically closes linked issue when PR is merged.
Usage: `Fixes #<issue number>`, or `Fixes (paste link of issue)`.
_If PR is about `failing-tests or flakes`, please post the related issues/tests in a comment and do not use `Fixes`_*
-->
Fixes #
**Special notes for your reviewer**:
**Does this PR introduce a user-facing change?**:
<!--
If no, just write "NONE" in the release-note block below.
If yes, a release note is required:
Enter your extended release note in the block below. If the PR requires additional action from users switching to the new release, include the string "action required".
If you have questions, join us on the [kubernetes slack](https://kubernetes.slack.com), channel **#kubespray**.
If you have questions, check the [documentation](https://kubespray.io) and join us on the [kubernetes slack](https://kubernetes.slack.com), channel **\#kubespray**.
You can get your invite [here](http://slack.k8s.io/)
- Can be deployed on **AWS, GCE, Azure, OpenStack or Baremetal**
- **High available** cluster
- Can be deployed on **AWS, GCE, Azure, OpenStack, vSphere, Packet (bare metal), Oracle Cloud Infrastructure (Experimental), or Baremetal**
- **Highly available** cluster
- **Composable** (Choice of the network plugin for instance)
Note: When Ansible is already installed via system packages on the control machine, other python packages installed via `sudo pip install -r requirements.txt` will go to a different directory tree (e.g. `/usr/local/lib/python2.7/dist-packages` on Ubuntu) from Ansible's (e.g. `/usr/lib/python2.7/dist-packages/ansible` still on Ubuntu).
As a consequence, `ansible-playbook` command will fail with:
```raw
ERROR! no action detected in task. This often indicates a misspelled module name, or incorrect module path.
```
probably pointing at a task that depends on a module present in requirements.txt (e.g. "unseal vault").
One way of solving this would be to uninstall the Ansible package and then install it via pip, but this is not always possible.
A workaround consists of setting `ANSIBLE_LIBRARY` and `ANSIBLE_MODULE_UTILS` environment variables respectively to the `ansible/modules` and `ansible/module_utils` subdirectories of pip packages installation location, which can be found in the Location field of the output of `pip show [package]` before executing `ansible-playbook`.
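The workaround above can be sketched as a small shell helper. This is only an illustration under stated assumptions: the function name `ansible_env_from_pip_show` is hypothetical, and it assumes the `pip show` output contains a single `Location:` line.

```shell
# Hypothetical helper (not part of Kubespray): reads `pip show <package>`
# output on stdin and prints the two export lines described above.
ansible_env_from_pip_show() {
    # Take the value of the first "Location: ..." line.
    location="$(awk -F': ' '/^Location: / { print $2; exit }')"
    if [ -z "$location" ]; then
        echo "no Location field found in pip show output" >&2
        return 1
    fi
    printf 'export ANSIBLE_LIBRARY="%s/ansible/modules"\n' "$location"
    printf 'export ANSIBLE_MODULE_UTILS="%s/ansible/module_utils"\n' "$location"
}
```

It could be used as `eval "$(pip show <package> | ansible_env_from_pip_show)"` before running `ansible-playbook`, where `<package>` is one of the packages installed from requirements.txt.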
### Vagrant
For Vagrant we need to install python dependencies for provisioning tasks.
Check if Python and pip are installed:
```ShellSession
python -V && pip -V
```
If this returns the version of the software, you're good to go. If not, download and install Python from here <https://www.python.org/downloads/source/>
Install the necessary requirements
```ShellSession
sudo pip install -r requirements.txt
vagrant up
```
## Documents
- [Requirements](#requirements)
- [Kubespray vs ...](docs/comparisons.md)
- [Getting started](docs/getting-started.md)
- [Ansible inventory and tags](docs/ansible.md)
- [Integration with existing ansible repo](docs/integration.md)
- [Deployment data variables](docs/vars.md)
- [DNS stack](docs/dns-stack.md)
- [HA mode](docs/ha-mode.md)
- [Network plugins](#network-plugins)
- [Vagrant install](docs/vagrant.md)
- [CoreOS bootstrap](docs/coreos.md)
- [Debian Jessie setup](docs/debian.md)
- [openSUSE setup](docs/opensuse.md)
- [Downloaded artifacts](docs/downloads.md)
- [Cloud providers](docs/cloud.md)
- [OpenStack](docs/openstack.md)
- [AWS](docs/aws.md)
- [Azure](docs/azure.md)
- [vSphere](docs/vsphere.md)
- [Packet Host](docs/packet.md)
- [Large deployments](docs/large-deployments.md)
- [Upgrades basics](docs/upgrades.md)
- [Roadmap](docs/roadmap.md)
## Supported Linux Distributions
- **Container Linux by CoreOS**
- **Debian** Buster, Jessie, Stretch, Wheezy
- **Ubuntu** 16.04, 18.04
- **CentOS/RHEL** 7
- **Fedora** 28
- **Fedora/CentOS** Atomic
- **openSUSE** Leap 42.3/Tumbleweed
- **Oracle Linux** 7
Note: Upstart/SysV init based OS types are not supported.
[docker](https://www.docker.com/) v1.13 (see note)<br>
[rkt](https://coreos.com/rkt/docs/latest/) v1.21.0 (see Note 2)<br>
Note: The list of validated [docker versions](https://github.com/kubernetes/kubernetes/blob/master/CHANGELOG-1.16.md) was updated to 1.13.1, 17.03, 17.06, 17.09, 18.06, 18.09. kubeadm now properly recognizes Docker 18.09.0 and newer, but still treats 18.06 as the default supported version. The kubelet might break on docker's non-standard version numbering (it no longer uses semantic versioning). To ensure auto-updates don't break your cluster, look into e.g. the yum versionlock plugin or apt pinning.
Note: kubernetes doesn't support newer docker versions. Among other things, kubelet currently breaks on docker's non-standard version numbering (it no longer uses semantic versioning). To ensure auto-updates don't break your cluster, look into e.g. the yum versionlock plugin or apt pinning.
## Requirements
Note 2: rkt support as docker alternative is limited to control plane (etcd and
kubelet). Docker is still used for Kubernetes cluster workloads and network
plugins' related OS services. Also note, only one of the supported network
plugins can be deployed for a given single cluster.
- **Minimum required version of Kubernetes is v1.15**
- **Ansible v2.7.16 and python-netaddr is installed on the machine that will run Ansible commands**
- **Jinja 2.9 (or newer) is required to run the Ansible Playbooks**
- The target servers must have **access to the Internet** in order to pull docker images. Otherwise, additional configuration is required (See [Offline Environment](https://github.com/kubernetes-sigs/kubespray/blob/master/docs/downloads.md#offline-environment))
- The target servers are configured to allow **IPv4 forwarding**.
- **Your ssh key must be copied** to all the servers part of your inventory.
- The **firewalls are not managed**, you'll need to implement your own rules the way you used to.
In order to avoid any issue during deployment you should disable your firewall.
- If kubespray is run from a non-root user account, the correct privilege escalation method
should be configured on the target servers. Then the `ansible_become` flag
or the command parameters `--become` or `-b` should be specified.
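As an illustration of the IPv4 forwarding requirement above, here is a minimal preflight sketch to run on a target server. The function name `ipv4_forwarding_enabled` is hypothetical and not part of Kubespray; it assumes a Linux host that exposes the setting at `/proc/sys/net/ipv4/ip_forward`.

```shell
# Hypothetical preflight check: succeeds (exit status 0) when the given
# sysctl file contains "1", i.e. IPv4 forwarding is enabled.
# The optional argument overrides the file path, which helps with testing.
ipv4_forwarding_enabled() {
    sysctl_file="${1:-/proc/sys/net/ipv4/ip_forward}"
    [ -r "$sysctl_file" ] && [ "$(cat "$sysctl_file")" = "1" ]
}
```

For example, `ipv4_forwarding_enabled || echo "IPv4 forwarding is disabled"` reports a host that still needs `sysctl -w net.ipv4.ip_forward=1`.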
Requirements
--------------
Hardware:
These limits are safe guarded by Kubespray. Actual requirements for your workload can differ. For a sizing guide go to the [Building Large Clusters](https://kubernetes.io/docs/setup/cluster-large/#size-of-master-and-master-components) guide.
* **Ansible v2.4 (or newer) and python-netaddr is installed on the machine
that will run Ansible commands**
* **Jinja 2.9 (or newer) is required to run the Ansible Playbooks**
* The target servers must have **access to the Internet** in order to pull docker images.
* The target servers are configured to allow **IPv4 forwarding**.
* **Your ssh key must be copied** to all the servers part of your inventory.
* The **firewalls are not managed**, you'll need to implement your own rules the way you used to.
in order to avoid any issue during deployment you should disable your firewall.
- Master
- Memory: 1500 MB
- Node
- Memory: 1024 MB
## Network Plugins
## Network plugins
You can choose between 10 network plugins. (default: `calico`, except Vagrant uses `flannel`)
You can choose between 4 network plugins. (default: `calico`, except Vagrant uses `flannel`)
- [canal](https://github.com/projectcalico/canal): a composition of calico and flannel plugins.
* [**canal**](https://github.com/projectcalico/canal): a composition of calico and flannel plugins.
- [cilium](http://docs.cilium.io/en/latest/): layer 3/4 networking (as well as layer 7 to protect and secure application protocols), supports dynamic insertion of BPF bytecode into the Linux kernel to implement security services, networking and visibility logic.
* [**weave**](docs/weave.md): Weave is a lightweight container overlay network that doesn't require an external K/V database cluster. <br>
(Please refer to `weave` [troubleshooting documentation](http://docs.weave.works/weave/latest_release/troubleshooting.html)).
- [contiv](docs/contiv.md): supports vlan, vxlan, bgp and Cisco SDN networking. This plugin is able to
apply firewall policies, segregate containers in multiple network and bridging pods onto physical networks.
- [weave](docs/weave.md): Weave is a lightweight container overlay network that doesn't require an external K/V database cluster.
(Please refer to `weave` [troubleshooting documentation](https://www.weave.works/docs/net/latest/troubleshooting/)).
- [kube-ovn](docs/kube-ovn.md): Kube-OVN integrates the OVN-based Network Virtualization with Kubernetes. It offers an advanced Container Network Fabric for Enterprises.
- [kube-router](docs/kube-router.md): Kube-router is a L3 CNI for Kubernetes networking aiming to provide operational
simplicity and high performance: it uses IPVS to provide Kube Services Proxy (if setup to replace kube-proxy),
iptables for network policies, and BGP for pods L3 networking (with optionally BGP peering with out-of-cluster BGP peers).
It can also optionally advertise routes to Kubernetes cluster Pods CIDRs, ClusterIPs, ExternalIPs and LoadBalancerIPs.
- [macvlan](docs/macvlan.md): Macvlan is a Linux network driver. Pods have their own unique MAC and IP address and are connected directly to the physical (layer 2) network.
- [multus](docs/multus.md): Multus is a meta CNI plugin that provides multiple network interface support to pods. For each interface Multus delegates CNI calls to secondary CNI plugins such as Calico, macvlan, etc.
The choice is defined with the variable `kube_network_plugin`. There is also an
option to leverage built-in cloud provider networking instead.
The Kubespray Project is released on an as-needed basis. The process is as follows:
1. An issue is proposing a new release with a changelog since the last release
2. At least one of the [OWNERS](OWNERS) must LGTM this release
3. An OWNER runs `git tag -s $VERSION` and inserts the changelog and pushes the tag with `git push $VERSION`
4. The release issue is closed
5. An announcement email is sent to `kubernetes-dev@googlegroups.com` with the subject `[ANNOUNCE] Kubespray $VERSION is released`
2. At least one of the [approvers](OWNERS_ALIASES) must approve this release
3. An approver creates [new release in GitHub](https://github.com/kubernetes-sigs/kubespray/releases/new) using a version and tag name like `vX.Y.Z` and attaching the release notes
4. An approver creates a release branch in the form `release-vX.Y`
5. The corresponding version of [quay.io/kubespray/kubespray:vX.Y.Z](https://quay.io/repository/kubespray/kubespray) docker image is built and tagged
6. The `KUBESPRAY_VERSION` variable is updated in `.gitlab-ci.yml`
7. The release issue is closed
8. An announcement email is sent to `kubernetes-dev@googlegroups.com` with the subject `[ANNOUNCE] Kubespray $VERSION is released`
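The naming convention in the steps above (a `vX.Y.Z` tag and a `release-vX.Y` branch) can be sketched as a small helper. The function name `release_branch_for_tag` is hypothetical and only illustrates how the branch name follows from the tag. It is not part of the release tooling.

```shell
# Hypothetical helper: derive the release branch name (release-vX.Y)
# from a release tag of the form vX.Y.Z.
release_branch_for_tag() {
    tag="$1"
    case "$tag" in
        # Loose shape check only: "v", then three dot-separated parts
        # that each start with a digit.
        v[0-9]*.[0-9]*.[0-9]*) ;;
        *) echo "expected a tag like vX.Y.Z, got: $tag" >&2; return 1 ;;
    esac
    # Strip the trailing ".Z" component and add the "release-" prefix.
    printf 'release-%s\n' "${tag%.*}"
}
```

With a made-up tag such as `v2.12.3`, the helper prints `release-v2.12`.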
## Major/minor releases, merge freezes and milestones
* Kubespray does not maintain stable branches for releases. Releases are tags, not
branches, and there are no backports. Therefore, there is no need for merge
freezes as well.
* Kubespray maintains one branch for major releases (vX.Y). Minor releases are available only as tags.
* Security patches and bugs might be backported.
* Fixes for major releases (vX.x.0) and minor releases (vX.Y.x) are delivered
via maintenance releases (vX.Y.Z) and assigned to the corresponding open
As contributors and maintainers of this project, and in the interest of fostering
an open and welcoming community, we pledge to respect all people who contribute
through reporting issues, posting feature requests, updating documentation,
submitting pull requests or patches, and other activities.
We are committed to making participation in this project a harassment-free experience for
everyone, regardless of level of experience, gender, gender identity and expression,
sexual orientation, disability, personal appearance, body size, race, ethnicity, age,
religion, or nationality.
Examples of unacceptable behavior by participants include:
* The use of sexualized language or imagery
* Personal attacks
* Trolling or insulting/derogatory comments
* Public or private harassment
* Publishing other's private information, such as physical or electronic addresses,
without explicit permission
* Other unethical or unprofessional conduct.
Project maintainers have the right and responsibility to remove, edit, or reject
comments, commits, code, wiki edits, issues, and other contributions that are not
aligned to this Code of Conduct. By adopting this Code of Conduct, project maintainers
commit themselves to fairly and consistently applying these principles to every aspect
of managing this project. Project maintainers who do not follow or enforce the Code of
Conduct may be permanently removed from the project team.
This code of conduct applies both within project spaces and in public spaces
when an individual is representing the project or its community.
Instances of abusive, harassing, or otherwise unacceptable behavior may be reported by contacting a Kubernetes maintainer, Sarah Novotny <sarahnovotny@google.com>, and/or Dan Kohn <dan@linuxfoundation.org>.
This Code of Conduct is adapted from the Contributor Covenant
(http://contributor-covenant.org), version 1.2.0, available at
http://contributor-covenant.org/version/1/2/0/
### Kubernetes Events Code of Conduct
Kubernetes events are working conferences intended for professional networking and collaboration in the
Kubernetes community. Attendees are expected to behave according to professional standards and in accordance
with their employer's policies on appropriate workplace behavior.
While at Kubernetes events or related social networking opportunities, attendees should not engage in
discriminatory or offensive speech or actions regarding gender, sexuality, race, or religion. Speakers should
be especially aware of these concerns.
The Kubernetes team does not condone any statements by speakers contrary to these standards. The Kubernetes
team reserves the right to deny entrance and/or eject from an event (without refund) any individual found to
be engaging in discriminatory or offensive speech or actions.
Please bring any concerns to the immediate attention of Kubernetes event staff.
# Due to some Azure limitations (ex:- Storage Account's name must be unique),
# Due to some Azure limitations (ex:- Storage Account's name must be unique),
# this name must be globally unique - it will be used as a prefix for azure components
cluster_name: example
@@ -7,6 +7,10 @@ cluster_name: example
# node that can be used to access the masters and minions
use_bastion: false
# Set this to a preferred name that will be used as the first part of the dns name for your bastion host. For example: k8s-bastion.<azureregion>.cloudapp.azure.com.
# This is convenient when exceptions have to be configured on a firewall to allow ssh to the given bastion host.
# bastion_domain_prefix: k8s-bastion
number_of_k8s_masters: 3
number_of_k8s_nodes: 3
@@ -20,7 +24,8 @@ admin_username: devops
admin_password: changeme
# MAKE SURE TO CHANGE THIS TO YOUR PUBLIC KEY to access your azure machines
MetalLB hooks into your Kubernetes cluster, and provides a network load-balancer implementation. In short, it allows you to create Kubernetes services of type “LoadBalancer” in clusters that don’t run on a cloud provider, and thus cannot simply hook into paid products to provide load-balancers.
```
This playbook aims to automate [this](https://metallb.universe.tf/concepts/layer2/). It deploys MetalLB into kubernetes and sets up a layer 2 loadbalancer.
## Install
```
Defaults can be found in contrib/metallb/roles/provision/defaults/main.yml. You can override the defaults by copying the contents of this file to somewhere in inventory/mycluster/group_vars such as inventory/mycluster/group_vars/k8s-cluster/addons.yml and making any adjustments as required.
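A minimal sketch of that copy step, assuming it is run from the kubespray root folder. The function name `copy_metallb_defaults` is hypothetical; it appends rather than overwrites so that existing settings in the target file are kept.

```shell
# Hypothetical helper: append the role defaults to an inventory
# group_vars file so they can be adjusted there.
copy_metallb_defaults() {
    defaults_file="$1"   # e.g. contrib/metallb/roles/provision/defaults/main.yml
    override_file="$2"   # e.g. inventory/mycluster/group_vars/k8s-cluster/addons.yml
    mkdir -p "$(dirname "$override_file")"
    cat "$defaults_file" >> "$override_file"
}
```

It would be called as `copy_metallb_defaults contrib/metallb/roles/provision/defaults/main.yml inventory/mycluster/group_vars/k8s-cluster/addons.yml`, after which the copied values can be edited in place.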
@@ -6,16 +6,16 @@ You can either deploy using Ansible on its own by supplying your own inventory f
In the same directory of this ReadMe file you should find a file named `inventory.example` which contains an example setup. Please note that, additionally to the Kubernetes nodes/masters, we define a set of machines for GlusterFS and we add them to the group `[gfs-cluster]`, which in turn is added to the larger `[network-storage]` group as a child group.
Change that file to reflect your local setup (adding more machines or removing them and setting the adequate ip numbers), and save it to `inventory/k8s_gfs_inventory`. Make sure that the settings on `inventory/group_vars/all.yml` make sense with your deployment. Then change to the kubespray root folder, and execute (supposing that the machines are all using ubuntu):
Change that file to reflect your local setup (adding more machines or removing them and setting the adequate ip numbers), and save it to `inventory/sample/k8s_gfs_inventory`. Make sure that the settings on `inventory/sample/group_vars/all.yml` make sense with your deployment. Then change to the kubespray root folder, and execute (supposing that the machines are all using ubuntu):
If your machines are not using Ubuntu, you need to change the `--user=ubuntu` to the correct user. Alternatively, if your Kubernetes machines are using one OS and your GlusterFS a different one, you can instead specify the `ansible_ssh_user=<correct-user>` variable in the inventory file that you just created, for each machine/VM:
@@ -21,7 +21,7 @@ You can specify a `default_release` for apt on Debian/Ubuntu by overriding this
glusterfs_ppa_use: yes
glusterfs_ppa_version: "3.5"
For Ubuntu, specify whether to use the official Gluster PPA, and which version of the PPA to use. See Gluster's [Getting Started Guide](http://www.gluster.org/community/documentation/index.php/Getting_started_install) for more info.
For Ubuntu, specify whether to use the official Gluster PPA, and which version of the PPA to use. See Gluster's [Getting Started Guide](https://docs.gluster.org/en/latest/Quick-Start-Guide/Quickstart/) for more info.
- name: Ensure Gluster brick and mount directories exist.
file: "path={{ item }} state=directory mode=0775"
with_items:
- "{{ gluster_brick_dir }}"
- "{{ gluster_mount_dir }}"
- "{{ gluster_brick_dir }}"
- "{{ gluster_mount_dir }}"
- name: Configure Gluster volume.
- name: Configure Gluster volume with replicas
gluster_volume:
state: present
name: "{{ gluster_brick_name }}"
brick: "{{ gluster_brick_dir }}"
replicas: "{{ groups['gfs-cluster'] | length }}"
cluster: "{% for item in groups['gfs-cluster'] -%}{{ hostvars[item]['ip']|default(hostvars[item].ansible_default_ipv4['address']) }}{% if not loop.last %},{% endif %}{%- endfor %}"
host: "{{ inventory_hostname }}"
force: yes
state: present
name: "{{ gluster_brick_name }}"
brick: "{{ gluster_brick_dir }}"
replicas: "{{ groups['gfs-cluster'] | length }}"
cluster: "{% for item in groups['gfs-cluster'] -%}{{ hostvars[item]['ip']|default(hostvars[item].ansible_default_ipv4['address']) }}{% if not loop.last %},{% endif %}{%- endfor %}"
host: "{{ inventory_hostname }}"
force: yes
run_once: true
when: groups['gfs-cluster']|length > 1
- name: Configure Gluster volume without replicas
gluster_volume:
state: present
name: "{{ gluster_brick_name }}"
brick: "{{ gluster_brick_dir }}"
cluster: "{% for item in groups['gfs-cluster'] -%}{{ hostvars[item]['ip']|default(hostvars[item].ansible_default_ipv4['address']) }}{% if not loop.last %},{% endif %}{%- endfor %}"
when: inventory_hostname == groups['kube-master'][0] and groups['gfs-cluster'] is defined and hostvars[groups['gfs-cluster'][0]].gluster_disk_size_gb is defined
# Deploy Heketi/Glusterfs into Kubespray/Kubernetes
This playbook aims to automate [this](https://github.com/heketi/heketi/blob/master/docs/admin/install-kubernetes.md) tutorial. It deploys heketi/glusterfs into kubernetes and sets up a storageclass.
## Client Setup
Heketi provides a CLI that provides users with a means to administer the deployment and configuration of GlusterFS in Kubernetes. [Download and install the heketi-cli](https://github.com/heketi/heketi/releases) on your client machine.
## Install
Copy the inventory.yml.sample over to inventory/sample/k8s_heketi_inventory.yml and change it according to your setup.
Some files were not shown because too many files have changed in this diff