Compare commits


447 Commits

Author SHA1 Message Date
Matthew Mosesohn
acae0fe4a3 Merge pull request #1205 from holser/resolv_updates
Refactoring resolv.conf
2017-04-05 14:22:52 +03:00
Sergii Golovatiuk
2670eefcd4 Refactoring resolv.conf
- Renaming templates for netchecker
- Add dnsPolicy: ClusterFirstWithHostNet to kube-proxy

Signed-off-by: Sergii Golovatiuk <sgolovatiuk@mirantis.com>
2017-04-05 09:28:01 +02:00
Matthew Mosesohn
c0cae9e8a0 Merge pull request #1204 from mattymo/resolvconf-nodes
Restart kubelet when updating /etc/resolv.conf on all k8s nodes
2017-04-04 22:03:44 +03:00
Matthew Mosesohn
f8cf6b4f7c Merge pull request #1186 from holser/resolv_conf
Set ClusterFirstWithHostNet for Pods with hostnetwork: true
2017-04-04 20:49:55 +03:00
Matthew Mosesohn
a29182a010 Restart kubelet when updating /etc/resolv.conf on all k8s nodes 2017-04-04 20:43:47 +03:00
Sergii Golovatiuk
1cfe0beac0 Set ClusterFirstWithHostNet for Pods with hostnetwork: true
In Kubernetes 1.6, ClusterFirstWithHostNet was added as an option. With
it, kubelet generates the pod's resolv.conf based on its own
resolv.conf. However, this doesn't create 'options', so the proper
solution requires some investigation.

This patch sets the same resolv.conf for kubelet as the host.

Signed-off-by: Sergii Golovatiuk <sgolovatiuk@mirantis.com>
2017-04-04 16:34:13 +02:00
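For illustration of the option mentioned in the commit above, a minimal pod spec sketch (the pod name and image are placeholders, not taken from this patch):

```
apiVersion: v1
kind: Pod
metadata:
  name: hostnet-example   # placeholder name
spec:
  hostNetwork: true
  # Added in Kubernetes 1.6: resolve cluster DNS first even with host networking
  dnsPolicy: ClusterFirstWithHostNet
  containers:
    - name: app
      image: busybox
      command: ["sleep", "3600"]
```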
Matthew Mosesohn
798f90c4d5 Merge pull request #1153 from mattymo/graceful_drain
Move graceful upgrade test to Ubuntu canal HA, adjust drain
2017-04-04 17:33:53 +03:00
Matthew Mosesohn
fac4334950 Merge pull request #1201 from mattymo/configurable_failure
Make any_errors_fatal configurable
2017-04-04 15:51:56 +03:00
Matthew Mosesohn
f8d44a8a88 Merge pull request #1200 from mattymo/issue1190
Fix multiline condition for k8s check certs
2017-04-04 15:48:05 +03:00
Matthew Mosesohn
1136a94a6e Merge pull request #1191 from justindowning/patch-1
pin ansible to version 2.2.1.0
2017-04-04 13:42:32 +03:00
Matthew Mosesohn
fd20e0de90 Wait for container creation in check network test 2017-04-04 13:12:24 +03:00
Matthew Mosesohn
a1150dc334 Make any_errors_fatal configurable
Useful at scale when 1 or 2 nodes may fail: you can proceed with
the majority and go back and fix the others later.
2017-04-04 12:52:47 +03:00
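A minimal sketch of how a play can expose this as a tunable (the variable name is illustrative, not necessarily the one used in the PR):

```
# Operators can set any_errors_fatal=false at scale to tolerate a few
# failed nodes and continue with the majority.
- hosts: k8s-cluster
  any_errors_fatal: "{{ any_errors_fatal | default(true) }}"
  roles:
    - { role: kubernetes/node }
```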
Matthew Mosesohn
b4d06ff8dd Add /var/lib/cni to kubelet
Necessary to persist this directory for host-local IPAM used by Canal
Add pre-upgrade task to copy /var/lib/cni out of old kubelet.
2017-04-03 19:38:24 +03:00
Matthew Mosesohn
7581705007 Merge pull request #1185 from intelsdi-x/hostname
Use hostname module to set hostname, and do it for all OSes not only Co…
2017-04-03 19:01:12 +03:00
Matthew Mosesohn
5a5707159a Fix multiline condition for k8s check certs
Fixes #1190
2017-04-03 17:44:55 +03:00
Matthew Mosesohn
742a1681ce Merge pull request #1166 from rogerwelin/master
add iptables --flush to reset role
2017-04-03 17:25:10 +03:00
Matthew Mosesohn
fba9b9cb65 Merge pull request #1182 from artem-panchenko/bumpCalicoPolicyControllerVersion
Bump calico policy controller version
2017-04-03 17:21:52 +03:00
Paweł Skrzyński
61b2d7548a Use hostname module to set hostname, and do it for all OSes, not only CoreOS 2017-04-03 15:09:33 +02:00
Matthew Mosesohn
80828a7c77 use etcd2 when upgrading unless forced 2017-04-03 15:07:42 +03:00
Matthew Mosesohn
f5af86c9d5 Merge pull request #1194 from adidenko/fix-sync_certs
Fix multiline when condition in sync_certs task
2017-03-31 17:39:40 +03:00
Aleksandr Didenko
58acbe7caf Fix multiline when condition in sync_certs task
Folded style in a multiline 'when' condition causes an error with an
unexpected indent. Changing it to literal style should fix
the issue.

Closes #1190
2017-03-30 22:21:04 +02:00
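To illustrate the YAML detail behind the fix above (the condition shown is a placeholder, not the actual task):

```
# Folded scalar (>), reported to break with an unexpected-indent error:
when: >-
  inventory_hostname == groups['kube-master'][0] and
  sync_certs | default(false)

# Literal scalar (|), the style this commit switches to:
when: |
  inventory_hostname == groups['kube-master'][0] and
  sync_certs | default(false)
```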
Spencer Smith
355b92d7ba Merge pull request #1170 from jlothian/atomic-docker-network
1169 - fix docker systemd unit
2017-03-30 13:13:28 -07:00
Matthew Mosesohn
d42e4f2344 Update .gitlab-ci.yml 2017-03-30 12:19:15 +04:00
Justin Downing
907e43b9d5 pin ansible to version 2.2.1.0
ansible 2.2.2.0 has an [issue]() that causes problems for kargo:

```
(env) kargo ᐅ env/bin/ansible-playbook upgrade-cluster.yml 
ERROR! Unexpected Exception: 'Host' object has no attribute 'remove_group'
```

Pinning ansible to 2.2.1.0 resolved this for me.
2017-03-29 21:40:34 -04:00
Matthew Mosesohn
fb467df47c fix etcd restart 2017-03-29 23:22:49 +04:00
Matthew Mosesohn
48beef25fa delete master containers forcefully 2017-03-27 19:08:22 +03:00
Matthew Mosesohn
a3f568fc64 restart scheduler and controller-manager too 2017-03-27 13:51:35 +03:00
Matthew Mosesohn
57ee304260 ensure post-upgrade purge ones only once 2017-03-27 13:28:37 +03:00
Matthew Mosesohn
0794a866a7 switch debian8-canal-ha to ubuntu 2017-03-27 13:28:37 +03:00
Matthew Mosesohn
49e4d344da move network plugins out of grouped upgrades 2017-03-27 13:28:37 +03:00
Matthew Mosesohn
21a9dea99f move kubernetes-apps/network-plugin back to master role 2017-03-27 13:28:37 +03:00
Matthew Mosesohn
6e505c0c3f Fix delegate tasks for kubectl and etcdctl 2017-03-27 13:28:37 +03:00
Matthew Mosesohn
e9a294fd9c Significantly reduce memory requirements
Canal runs more pods and upgrades need a bit of extra
room to load new pods in and get the old ones out.
2017-03-27 13:28:37 +03:00
Matthew Mosesohn
44d851d5bb Only cordon Ready nodes 2017-03-27 13:28:37 +03:00
Matthew Mosesohn
5ed03ce7f0 Use checksum of dnsmasq config to trigger updates of dnsmasq
Allows config changes made by Ansible to restart dnsmasq deployment
2017-03-27 13:28:37 +03:00
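A hedged sketch of the checksum trick from the commit above, assuming a templated config and a deployment template (task and variable names are illustrative):

```
- name: Write dnsmasq config
  template:
    src: 01-kube-dns.conf.j2
    dest: /etc/dnsmasq.d/01-kube-dns.conf
  register: dnsmasq_config   # the template module returns a checksum

# In the deployment's pod template, embed the checksum so any config
# change produces a new pod spec and triggers a rolling update:
#   annotations:
#     kargo/config-checksum: "{{ dnsmasq_config.checksum }}"
```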
Matthew Mosesohn
c1b9660ec8 Move graceful upgrade test to debian canal HA, adjust drain
Graceful upgrades require 3 nodes
Drain now has a command timeout of 40s
2017-03-27 13:28:37 +03:00
Matthew Mosesohn
c2c334d22f Merge pull request #1181 from holser/refactor_etcd
Refactor etcd role
2017-03-27 13:05:35 +03:00
Antoine Legrand
ed5c848473 Merge pull request #1175 from zoidbergwill/patch-1
Fix markdown of heading in README
2017-03-27 09:33:43 +02:00
Sergii Golovatiuk
f144fd1ed3 Refactor etcd role
- Run docker run from script rather than directly from systemd target
- Refactoring styling/templates

Signed-off-by: Sergii Golovatiuk <sgolovatiuk@mirantis.com>
2017-03-24 12:34:15 +01:00
Artem Panchenko
e96557f410 Bump calico policy controller version
Latest released version of kube-policy-controller
contains important bug fixes and should be used
by default.
2017-03-24 12:13:09 +02:00
Antoine Legrand
ac96d5ccf0 Merge pull request #1176 from zoidbergwill/patch-2
Update roadmap.md
2017-03-23 12:05:35 +01:00
Matthew Mosesohn
b2af19471e Merge pull request #1177 from rutsky/replace-nbsp
replace non-breakable space with regular space
2017-03-23 12:59:45 +03:00
Matthew Mosesohn
6805d0ff2b Merge pull request #1179 from kubernetes-incubator/missing_defaults
Add missing defaults
2017-03-23 12:16:13 +03:00
Antoine Legrand
6e1de9d820 Add missing defaults 2017-03-23 10:05:34 +01:00
Matthew Mosesohn
d27ca7854f Merge pull request #1161 from VincentS/aws_deployment
Fixes for AWS Terraform Deployment
2017-03-23 11:59:39 +03:00
Vladimir Rutsky
c4e57477fb replace non-breakable space with regular space
A non-breakable space is the 0xc2 0xa0 byte sequence in UTF-8.

To find one:

    $ git grep -I -P '\xc2\xa0'

To replace with regular space:

    $ git grep -l -I -P '\xc2\xa0' | xargs sed -i 's/\xc2\xa0/ /g'

This commit doesn't include changes that will overlap with commit f1c59a91a1.
2017-03-23 00:25:01 +03:00
William Martin Stewart
f1c59a91a1 Update roadmap.md 2017-03-22 22:03:06 +02:00
William Martin Stewart
74c573ef04 Update README.md 2017-03-22 22:01:44 +02:00
Matthew Mosesohn
5f082bc0e5 Merge pull request #1172 from mattymo/dnsmasq_upgrade
Use checksum of dnsmasq config to trigger updates of dnsmasq
2017-03-22 18:00:10 +03:00
Matthew Mosesohn
0e3b7127b5 Merge pull request #1167 from mattymo/dnsmasq_when_deploying_master
Change wait for dnsmasq to skip if there are no kube-nodes in play
2017-03-22 17:59:56 +03:00
Roger Welin
f4638c7580 add iptables --flush to reset role 2017-03-22 11:10:24 +01:00
Matthew Mosesohn
8b0b500c89 Use checksum of dnsmasq config to trigger updates of dnsmasq
Allows config changes made by Ansible to restart dnsmasq deployment
2017-03-22 13:03:55 +03:00
Matthew Mosesohn
04746fc4d8 Merge pull request #1163 from mattymo/kvm_setup
Add KVM hypervisor playbook to contrib
2017-03-22 12:31:14 +03:00
Matthew Mosesohn
463ef3f8bc Merge pull request #1168 from mattymo/disable_download_delegate
Disable download_run_once and download_localhost for most CI scenarios
2017-03-22 12:19:24 +03:00
Josh Lothian
5e2f78424f 1169 - fix docker systemd unit
The docker-network environment file masks the new values
put into /etc/systemd/system/docker.service.d/flannel-options.conf
to renumber the docker0 to work correctly with flannel.
2017-03-21 15:22:14 -05:00
Matthew Mosesohn
3889c2e01c Add KVM hypervisor playbook to contrib
Optional Ansible playbook for preparing a host for running Kargo.
This includes creation of a user account, some basic packages,
and sysctl values required to allow CNI networking on a libvirt network.
2017-03-21 19:50:01 +03:00
Matthew Mosesohn
1887e984a0 Change wait for dnsmasq to skip if there are no kube-nodes in play
Also changed unnecessary delay to a max timeout (now defaulting to 1s sleep
between tries)

Also rename play_hosts to ansible_play_hosts
2017-03-21 18:55:22 +03:00
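The retry pattern described in the commit above, as a sketch (resource names and counts are illustrative):

```
- name: Wait for dnsmasq endpoints
  command: kubectl --namespace=kube-system get endpoints dnsmasq
  register: result
  until: result.rc == 0
  retries: 60      # max timeout = retries * delay
  delay: 1         # short sleep between tries instead of one long wait
  when: groups['kube-node'] | length > 0   # skip when no kube-nodes in play
```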
Matthew Mosesohn
a495bbc1db Disable download_run_once and download_localhost for most CI scenarios
This adds time to deployment, so we should only test it sparingly during
daily master.
2017-03-21 16:41:30 +03:00
Matthew Mosesohn
cd429d3654 Merge pull request #1159 from holser/etcd_backup_restore
Backup etcd
2017-03-21 13:07:44 +03:00
Matthew Mosesohn
771aef0b44 Merge pull request #1162 from holser/bump_coreos_ci
Bump CoreOS stable to latest version
2017-03-20 17:45:04 +03:00
Matthew Mosesohn
f7ef452d8a Merge pull request #1160 from mattymo/simpler_idempotency
Make reset check on idempotency check optional
2017-03-20 17:04:51 +03:00
Matthew Mosesohn
0f64f8db90 Merge pull request #1155 from mattymo/helm
Add helm deployment
2017-03-20 17:00:06 +03:00
Sergii Golovatiuk
c04a6254b9 Backup etcd data before restarting etcd
etcd is crucial part of kubernetes cluster. Ansible restarts etcd on
reconfiguration. Backup helps operator to restore cluster manually in
case of any issues.

Signed-off-by: Sergii Golovatiuk <sgolovatiuk@mirantis.com>
2017-03-20 14:50:52 +01:00
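A minimal sketch of such a pre-restart backup, assuming the etcd v2 data layout (paths and the task name are illustrative):

```
- name: Backup etcd data before restart
  command: >-
    etcdctl backup
    --data-dir {{ etcd_data_dir | default('/var/lib/etcd') }}
    --backup-dir /var/backups/etcd-{{ ansible_date_time.epoch }}
```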
Sergii Golovatiuk
485e17d6ed Bump CoreOS stable to latest version
1298.6.0 fixes some sporadic network issues. It also includes docker
1.12.6 which includes several stability fixes for kubernetes.

Signed-off-by: Sergii Golovatiuk <sgolovatiuk@mirantis.com>
2017-03-20 14:31:33 +01:00
Vincent Schwarzer
952ab03d2a Fixes for AWS Terraform Deployment 2017-03-20 12:08:17 +01:00
Matthew Mosesohn
bbb524018e Make reset check on idempotency check optional
By default we do not test reset.yml now.
2017-03-20 13:16:58 +03:00
Matthew Mosesohn
859c08620b Merge pull request #1105 from VincentS/aws_deployment
AWS Terraform for Kargo
2017-03-20 12:55:11 +03:00
Antoine Legrand
f6cd42e6e0 Merge pull request #1158 from rutsky/patch-6
limit jinja2 version to <2.9
2017-03-19 23:42:11 +01:00
Vladimir Rutsky
61ee67d612 limit jinja2 version to <2.9
Ansible 2.2.1 requires jinja2<2.9, see <https://github.com/ansible/ansible/blob/v2.2.1.0-1/setup.py#L25>,
but without explicitly limiting the upper jinja2 version here, pip ignores
Ansible's requirements and installs the latest available jinja2
(pip is not very smart here), which is incompatible with
Ansible 2.2.1.
With an incompatible jinja2 version, "ansible-vault create" (and probably other parts)
fails with:
  ERROR! Unexpected Exception: The 'jinja2<2.9' distribution was not found 
  and is required by ansible
This upper limit should be removed in 2.2.2 release, see:
<978311bf3f>
2017-03-20 01:33:08 +03:00
Matthew Mosesohn
939c1def5d Merge pull request #1152 from mattymo/redhat_weave
Fix weave on RHEL deployment
2017-03-19 16:45:20 +03:00
Matthew Mosesohn
b7ab80e8ea Merge pull request #1149 from mattymo/centos-retries
Retry yum/apt/rpm download commands
2017-03-18 11:12:36 +03:00
Matthew Mosesohn
b69d4b0ecc Add helm deployment 2017-03-17 20:24:41 +03:00
Matthew Mosesohn
2f437d7452 Merge pull request #1157 from rutsky/remove-change-k8s-version
remove obsolete script
2017-03-17 20:23:34 +03:00
Vladimir Rutsky
d761216ec1 remove obsolete script
Currently Kubernetes version can be selected using "kube_version" variable.
2017-03-17 20:09:36 +03:00
Matthew Mosesohn
088e9be931 Merge pull request #1156 from rutsky/patch-5
fix jinja package name
2017-03-17 20:08:36 +03:00
Vladimir Rutsky
32ecac6464 fix jinja package name
Jinja 2.* releases are published under `Jinja2` name.
2017-03-17 20:07:49 +03:00
Matthew Mosesohn
7760c3e4aa Retry yum/apt/rpm download commands, fix succeeded filter 2017-03-17 18:56:26 +03:00
Matthew Mosesohn
3cfb76e57f Merge pull request #1146 from mattymo/resolvconf_optimize
Condense resolvconf sources before starting loop
2017-03-17 18:42:32 +03:00
Matthew Mosesohn
e1faeb0f6c Fix weave on RHEL deployment
Reduce retry delay checking weave
Always load br_netfilter module
2017-03-17 18:17:47 +03:00
Matthew Mosesohn
25bff851dd Merge pull request #1136 from adidenko/fix-calico-policy-order
Move calico-policy-controller into separate role
2017-03-17 17:32:14 +03:00
Aleksandr Didenko
3a39904011 Move calico-policy-controller into separate role
By default Calico CNI does not create any network access policies
or profiles if 'policy' is enabled in the CNI config, and without any
policies/profiles network access to/from pods is blocked.

K8s related policies are created by calico-policy-controller in
such case. So we need to start it as soon as possible, before any
real workloads.

This patch also fixes kube-api port in calico-policy-controller
yaml template.

Closes #1132
2017-03-17 11:21:52 +01:00
Matthew Mosesohn
7e1fbfba64 Merge pull request #1147 from mattymo/calico-update
Update calico to 1.1.0-rc8
2017-03-17 13:17:41 +03:00
Matthew Mosesohn
a52064184e Condense resolvconf sources before starting loop 2017-03-17 13:06:56 +03:00
Matthew Mosesohn
b4a1ba828a Merge pull request #1148 from VincentS/patch-1
Fixed Formatting / Ansible-Playbook Command Upgrade Cluster
2017-03-16 19:55:59 +03:00
Vincent Schwarzer
c8c6105ee2 Fixed Formatting / Ansible-Playbook Command
- added -b and fixed typo in ansible-playbook command 
- fixed formatting issue
2017-03-16 17:53:48 +01:00
Matthew Mosesohn
0b49eeeba3 Update calico to 1.1.0-rc8
Fixes bug in CentOS/RHEL in felix related to overlayfs driver.
2017-03-16 19:23:36 +03:00
Matthew Mosesohn
b0830f0cd7 Merge pull request #1087 from bradbeam/openstack
Adding openstack domain id
2017-03-16 17:53:14 +03:00
Matthew Mosesohn
565d4a53b0 Merge pull request #1108 from idcrook/issue_1107-docker-versioning
Adding Docker CE 'stable' and 'edge' version packages
2017-03-16 16:32:13 +03:00
Matthew Mosesohn
9624662bf6 Merge pull request #1141 from mattymo/idempotency2
More idempotency fixes
2017-03-16 12:29:42 +03:00
Matthew Mosesohn
8195957461 Merge branch 'master' into idempotency2 2017-03-16 09:29:43 +03:00
Matthew Mosesohn
02fed4a082 Merge pull request #1138 from mattymo/idempotency-fixes
Idempotency fixes for etcd certs and resolvconf tasks
2017-03-16 09:20:28 +03:00
Bogdan Dobrelya
34ecf4ea51 Merge pull request #1109 from pcm32/feature/fixTerraformOS
Restores working order of contrib/terraform/openstack
2017-03-15 17:15:35 +01:00
Matthew Mosesohn
a422ad0d50 More idempotency fixes
Fixed sync_tokens fact
Fixed sync_certs for k8s tokens fact
Disabled register docker images changeability
Fixed CNI dir permission
Fix idempotency for etcd pre upgrade checks
2017-03-15 19:06:39 +03:00
Matthew Mosesohn
096d96e344 Merge pull request #1137 from holser/bug/1135
Turn on iptables for flannel
2017-03-15 17:06:42 +03:00
Bogdan Dobrelya
e61310bc89 Merge pull request #1140 from VincentS/jinja28
Added Jinja 2.8 to Docs
2017-03-15 13:18:53 +01:00
Vincent Schwarzer
111ca9584e Added Jinja 2.8 to Docs
Added Jinja 2.8 Requirements to docs and pip requirements file which
is needed to run the current Ansible Playbooks.
2017-03-15 13:11:09 +01:00
Matthew Mosesohn
7d35c4592c Merge pull request #1139 from VincentS/docu_fix
Fix for CoreOS Docu
2017-03-15 15:06:41 +03:00
Vincent Schwarzer
3e8386cbf3 Fixed CoreOS Docu
CoreOS docu was referencing outdated bootstrap playbook that
is now part of kargo itself.
2017-03-15 13:04:01 +01:00
Matthew Mosesohn
4354162067 Merge pull request #1080 from VincentS/Granular_Auth_Control
Granular authentication Control
2017-03-15 13:12:51 +03:00
Matthew Mosesohn
a62a444229 Merge pull request #1117 from mattymo/etcd3-upgrade
Migrate k8s data to etcd3 api store
2017-03-15 12:56:06 +03:00
Matthew Mosesohn
f6b72fa830 Make resolvconf preinstall idempotent 2017-03-15 01:20:13 +04:00
Sergii Golovatiuk
9667e8615f Turn on iptables for flannel
Closes: #1135
Closes: #1026
Signed-off-by: Sergii Golovatiuk <sgolovatiuk@mirantis.com>
2017-03-14 17:54:55 +01:00
Vincent Schwarzer
026da060f2 Granular authentication Control
It is now possible to deactivate selected authentication methods
(basic auth, token auth) inside the cluster by adding or
removing the required arguments to the Kube API Server and generating
the secrets accordingly.

The x509 authentication is currently not optional because disabling it
would affect the kubectl clients deployed on the master nodes.
2017-03-14 16:57:35 +01:00
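A hedged sketch of how such toggles can look in the apiserver manifest template (the variable names are assumptions, not necessarily the PR's):

```
# kube-apiserver args: each method is emitted only when enabled
{% if kube_basic_auth_enabled | default(true) %}
    - --basic-auth-file={{ kube_users_dir }}/known_users.csv
{% endif %}
{% if kube_token_auth_enabled | default(true) %}
    - --token-auth-file={{ kube_token_dir }}/known_tokens.csv
{% endif %}
```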
Matthew Mosesohn
3feab1cb2d Merge pull request #1134 from mattymo/1.6-support
Explicitly set cni-bin-dir
2017-03-14 17:53:08 +03:00
Matthew Mosesohn
804e9a09c0 Migrate k8s data to etcd3 api store
Default backend is now etcd3 (was etcd2).
The migration process consists of the following steps:
* check if migration is necessary
* stop etcd on first etcd server
* run migration script
* start etcd on first etcd server
* stop kube-apiserver until configuration is updated
* update kube-apiserver
* purge old etcdv2 data
2017-03-14 17:50:20 +03:00
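A rough sketch of the migration step from the list above, assuming etcd 3.x tooling (exact tasks and paths differ in the real play):

```
- name: Stop etcd before migration
  service: name=etcd state=stopped

- name: Migrate v2 keyspace to the v3 store
  command: etcdctl migrate --data-dir={{ etcd_data_dir | default('/var/lib/etcd') }}
  environment:
    ETCDCTL_API: "3"

- name: Start etcd again
  service: name=etcd state=started
```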
Matthew Mosesohn
4c6829513c Fix etcd idempotency 2017-03-14 17:23:29 +03:00
Matthew Mosesohn
4038954f96 Merge pull request #1078 from VincentS/oidc_support
Added Support for OpenID Connect Authentication
2017-03-14 12:07:21 +03:00
Matthew Mosesohn
52a6dd5427 Explicitly set cni-bin-dir 2017-03-13 20:13:21 +03:00
Matthew Mosesohn
c301dd5d94 Merge pull request #1118 from mattymo/noderolelabels
Add node labels in kubelet
2017-03-13 19:04:21 +03:00
Antoine Legrand
7cb7eee29d Merge pull request #1116 from kubernetes-incubator/contrib_docs
Reference external documentation sources
2017-03-07 13:33:25 +01:00
David Crook
a52e1069ce updated debian and ubuntu package names based on testing
docker-ce is not the .deb package name until the repositories are switched over to the new "downloads" docker webserver
2017-03-06 16:54:39 -07:00
David Crook
a8e5002aeb removed irrelevant comments 2017-03-06 16:02:53 -07:00
David Crook
c515a351c6 Merge branch 'master' into issue_1107-docker-versioning 2017-03-06 16:00:31 -07:00
Antoine Legrand
7777b30693 Merge pull request #1120 from bradbeam/fixtags
Removing cloud_provider tag to fix scenario where cloud_provider is n…
2017-03-06 19:00:41 +01:00
Brad Beam
d04fbf3f78 Removing cloud_provider tag to fix scenario where cloud_provider is not defined 2017-03-06 10:52:38 -06:00
Matthew Mosesohn
54207877bd Add node labels in kubelet
Related-issue: https://github.com/kubernetes/community/issues/300
Upgraded nodes do not obtain labels automatically.
See https://github.com/kubernetes/kubernetes/pull/29459 for more details.
2017-03-06 17:18:42 +03:00
Vincent Schwarzer
3c6b1480b8 Rewrote AWS Terraform for Kargo
Rewrote the AWS Terraform deployment for Kargo. It now supports
multiple Availability Zones, an AWS load balancer for the Kubernetes
API, a bastion host, ...

For more information see README
2017-03-06 12:52:02 +01:00
Vincent Schwarzer
b075960e3b Added Support for OpenID Connect Authentication
To use OpenID Connect Authentication, besides deploying an OpenID Connect
Identity Provider it is necessary to pass additional arguments to the Kube API Server.
These required arguments were added to the kube apiserver manifest.
2017-03-06 12:40:35 +01:00
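For reference, the kind of flags this adds to the manifest, sketched with placeholder values:

```
    # OIDC arguments for kube-apiserver (values are examples)
    - --oidc-issuer-url=https://accounts.example.com
    - --oidc-client-id=kubernetes
    - --oidc-username-claim=email
    - --oidc-groups-claim=groups
```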
Antoine Legrand
85596c2610 Merge pull request #1045 from bradbeam/vsphere
Adding vsphere cloud provider support
2017-03-06 12:34:05 +01:00
Antoine Legrand
0613e3c24d Reference external documentation sources 2017-03-06 12:25:54 +01:00
Antoine Legrand
ee5f009b95 Merge pull request #1112 from mattymo/skip_vault_if_disabled
Disable vault role properly on ansible 2.2.0
2017-03-06 11:27:53 +01:00
Antoine Legrand
d76816d043 Merge pull request #1115 from mattymo/etcd-phases
Remove standalone etcd specific play, cleanup host mode
2017-03-06 11:21:08 +01:00
Matthew Mosesohn
45274560ec Disable vault role properly on ansible 2.2.0
when condition does not seem to work correctly at playbook
level for ansible 2.2.0.
2017-03-05 00:43:01 +04:00
Matthew Mosesohn
02a8e78902 Remove standalone etcd specific play, cleanup host mode
The etcd role can now optionally disable etcd cluster setup for faster
deployment when it is combined with other roles.
2017-03-04 00:34:26 +04:00
Matthew Mosesohn
8f3d9e93ce Merge pull request #1111 from mattymo/use_find_for_certs
Use find module for checking for certificates
2017-03-03 20:08:33 +03:00
Matthew Mosesohn
a244aca6a4 Merge pull request #1113 from VincentS/AWS_IAM_PROFILES
Added Missing AWS IAM Profiles and Policies
2017-03-03 17:35:55 +03:00
Vincent Schwarzer
5ae85b9de5 Added Missing AWS IAM Profiles and Policies
The AWS IAM profiles and policies required to run Kargo on AWS
are no longer hosted in the kubernetes main repo since kube-up got
deprecated. Hence we have to move the files into the kargo repository.
2017-03-03 15:30:07 +01:00
Matthew Mosesohn
d176818c44 Use find module for checking for certificates
Also generate certs only when absent on master (rather than
when absent on target node)
2017-03-03 16:21:01 +03:00
Bogdan Dobrelya
aeec0f9a71 Merge pull request #1071 from vijaykatam/atomic_host
Add support for atomic host
2017-03-03 13:03:59 +01:00
Matthew Mosesohn
08a02af833 Merge pull request #1075 from VincentS/loadbalancer_aws
Possibility to add Loadbalancers without static IP (e.g. AWS ELB) #1074
2017-03-03 14:07:22 +03:00
Pablo Moreno
cf26585cff Restores working order of contrib/terraform/openstack, includes vault group and avoids group_vars/k8s-cluster.yml 2017-03-02 23:58:07 +00:00
David Crook
3f4a375ac4 first pass at adding 'stable' and 'edge' version packages
- Only have ubuntu to test on
  - fedora and redhat are placeholders/guesses
  - the "old" package repositories seem to have the "new" CE version which is `1.13.1` based
- `docker-ce` looks like it is named as a backported `docker-engine` package in some
  places

- Did not change the `defaults` version anywhere, so should work as before
- Did not point to new package repositories, as existing ones have the new packages.
2017-03-02 13:48:09 -07:00
Matthew Mosesohn
cc632f2713 Merge pull request #1099 from rutsky/patch-4
fix inline verbatim blocks formatting in markdown
2017-03-02 17:46:52 +03:00
Matthew Mosesohn
5ebc9a380c Merge pull request #1060 from holser/etcdv3
Allow to specify etcd backend for kube-api
2017-03-02 17:24:09 +03:00
Matthew Mosesohn
6453650895 Merge pull request #1093 from mattymo/scaledns
Add autoscalers for dnsmasq and kubedns
2017-03-02 16:58:56 +03:00
Matthew Mosesohn
9cb12cf250 Add autoscalers for dnsmasq and kubedns
By default kubedns and dnsmasq scale when installed.
Dnsmasq is no longer a daemonset; it is now a deployment.
Kubedns is no longer a replication controller; it is now a deployment.
Minimum replicas is two (to enable rolling updates).

Reduced memory requirements for dnsmasq and kubedns
2017-03-02 13:44:22 +03:00
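A sketch of the proportional-scaling approach described above: a small autoscaler deployment resizes kubedns by cluster size (image tag and parameters are illustrative, not necessarily what this commit ships):

```
containers:
  - name: autoscaler
    image: gcr.io/google_containers/cluster-proportional-autoscaler-amd64:1.0.0
    command:
      - /cluster-proportional-autoscaler
      - --namespace=kube-system
      - --configmap=kubedns-autoscaler
      - --target=Deployment/kube-dns
      # a minimum of 2 replicas keeps rolling updates possible
      - --default-params={"linear":{"nodesPerReplica":10,"min":2}}
```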
Vincent Schwarzer
68e8d74545 Changes based on feedback (additional ansible checks) 2017-03-02 11:04:10 +01:00
Vincent Schwarzer
fc054e21f6 Modified how adding LB for the Kube API is handled (AWS)
Until now it was not possible to add an API load balancer
without a static IP address, but certain load balancers
like the AWS Elastic Load Balancer don't have a fixed IP address.
With this commit it is possible to add these kinds of load balancers
to the Kargo deployment.
2017-03-02 11:04:10 +01:00
Matthew Mosesohn
3256f4bc0f Merge pull request #1103 from mattymo/upgradesyntax
Add upgrade-cluster and reset playbooks to syntax check
2017-03-02 12:41:10 +03:00
Matthew Mosesohn
0e9ad8f2c7 Merge pull request #1100 from retr0h/host-vars
Added host_vars to gitignore
2017-03-02 12:32:22 +03:00
Matthew Mosesohn
efbb5b2db3 Merge pull request #1101 from retr0h/docker-1.13.1
Use docker-engine 1.13.1
2017-03-02 12:31:58 +03:00
Matthew Mosesohn
85ed4157ff Add upgrade-cluster and reset playbooks to syntax check 2017-03-02 09:37:16 +04:00
John Dewey
a43569c8a5 Use docker-engine 1.13.1
The default version of Docker was switched to 1.13 in #1059.  This
change also bumped ubuntu from installing docker-engine 1.13.0 to
1.13.1.  This PR updates OS families which had 1.13 defined, but
were using 1.13.0.

The impetus for this change is an issue running tiller 1.2.3 on
docker 1.13.0.  See discussion [1][2].

[1] https://github.com/kubernetes/helm/issues/1838
[2] https://github.com/kubernetes-incubator/kargo/pull/1100
2017-03-01 12:53:39 -08:00
John Dewey
e771d0ea39 Updated gitignore pattern per review 2017-03-01 12:45:24 -08:00
John Dewey
9073eba405 Added host_vars to gitignore
Since inventory ships with kargo, the ability to change functionality
without having a dirty git index is nice.  An example we wish to change
is the version of docker deployed to our CentOS systems.  Due to an issue
with tiller and docker 1.13, we wish to deploy docker 1.12.  Since this
change does not belong in Kargo, we wish to locally override the docker
version until the issue is sorted.
2017-03-01 11:08:35 -08:00
Matthew Mosesohn
a5cd73d047 Merge pull request #959 from galthaus/host-mode-restart
Restart kube-controller for host_resolvconf mode
2017-03-01 20:54:21 +03:00
Vijay Katam
a0b1eda1d0 Add support for atomic host
Updates based on feedback

Simplify checks for file exists

remove invalid char

Review feedback. Use regular systemd file.

Add template for docker systemd atomic
2017-03-01 09:38:19 -08:00
Vladimir Rutsky
ad80e09ac5 fix inline verbatim blocks formatting in markdown 2017-03-01 17:50:28 +04:00
Antoine Legrand
77e5171679 Merge pull request #1076 from VincentS/etcd_openssl_count_fix
Fixed counter in ETCD Openssl.conf
2017-03-01 14:17:27 +01:00
Bogdan Dobrelya
0c66418dad Merge pull request #1090 from artem-panchenko/calicoAcceptHostEndpointConnections
Allow connections from pods to local endpoints
2017-03-01 13:37:05 +01:00
Bogdan Dobrelya
45a9eac7d2 Merge pull request #1097 from kubernetes-incubator/mattymo-patch-1
Fix vault role in upgrade-cluster.yml
2017-03-01 09:21:02 +01:00
Matthew Mosesohn
838adf7475 Fix vault role in upgrade-cluster.yml 2017-03-01 11:19:38 +03:00
Artem Panchenko
fa05d15093 Allow connections from pods to local endpoints
By default Calico blocks traffic from endpoints
to the host itself by using an iptables DROP
action. It could lead to a situation where a service
has one alive endpoint, but pods which run on
the same node cannot access it. Changed the action
to RETURN.
2017-03-01 09:21:02 +02:00
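In Calico this behaviour is governed by Felix's DefaultEndpointToHostAction; a sketch of the setting as a calico-node container env var (assuming the standard Felix env naming):

```
        - name: FELIX_DEFAULTENDPOINTTOHOSTACTION
          value: "RETURN"   # default is DROP; RETURN lets host rules decide
```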
Antoine Legrand
1122740bd7 Merge pull request #1094 from retr0h/vagrant-flannel
Ensure vagrant uses flannel
2017-03-01 00:07:24 +01:00
John Dewey
f877278075 Ensure vagrant uses flannel
The Vagrantfile is set up to use flannel.  The default network
was changed to Calico (#1031); however, the Vagrantfile was
not updated to reflect this.  This ensures the Vagrantfile remains
functional on master until someone decides to make it work
with Calico.
2017-02-28 13:31:28 -08:00
Matthew Mosesohn
cbaa6abdd0 Merge pull request #1066 from bradbeam/rkt-kubelet-cloudprovider
Adding KUBELET_CLOUDPROVIDER to kubelet.rkt.service
2017-02-28 20:02:56 +03:00
Matthew Mosesohn
76a4803292 Merge pull request #1084 from mattymo/fixubunturktjob
Remove upgrade from the ubuntu-rkt-sep CI job
2017-02-28 20:02:05 +03:00
Antoine Legrand
b286b2eb31 Merge pull request #1083 from holser/api_port
Change kube-api default port from 443 to 6443
2017-02-28 17:57:35 +01:00
Sergii Golovatiuk
295103adc0 Allow to specify etcd backend for kube-api
The Kubernetes project is about to make etcdv3 the default storage
engine in 1.6. This patch allows specifying a particular backend for
kube-apiserver. A user may force the option to etcdv3 for a new
environment. At the same time, if the environment uses v2 it will
continue to use it until the user decides to upgrade to v3.

Signed-off-by: Sergii Golovatiuk <sgolovatiuk@mirantis.com>
2017-02-28 17:13:22 +01:00
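Expressed as an inventory variable, the choice might look like this sketch (the variable name is an assumption based on the commit description):

```
# Force the v3 store for new clusters; keep etcd2 for existing ones
kube_apiserver_storage_backend: etcd3
```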
Sergii Golovatiuk
d31c040dc0 Change kube-api default port from 443 to 6443
The operator can specify any port for kube-api (6443 by default). This
helps in cases where some pods, such as Ingress, require 443 exclusively.

Closes: 820
Signed-off-by: Sergii Golovatiuk <sgolovatiuk@mirantis.com>
2017-02-28 15:45:35 +01:00
Brad Beam
bfff06d402 Adding KUBELET_CLOUDPROVIDER to kubelet.rkt.service 2017-02-28 06:29:35 -06:00
Matthew Mosesohn
21d3d75827 Merge pull request #1086 from bradbeam/lowermem
Lower default memory requests
2017-02-28 13:37:28 +03:00
Matthew Mosesohn
2c3538981a Merge pull request #1077 from holser/bug/1073
Make etcd data dir configurable.
2017-02-28 13:19:20 +03:00
Brad Beam
30a9899262 Making openstack domain name optional 2017-02-27 21:19:27 -06:00
Xavier Lange
dd10b8a27c Bug fix: support kilo's keystone requirement for domain-name, extracts from ENV var 2017-02-27 21:18:30 -06:00
Brad Beam
dbf13290f5 Updating vsphere cloud provider support 2017-02-27 15:08:04 -06:00
Sergii Golovatiuk
f9ff93c606 Make etcd data dir configurable.
Closes: #1073
Signed-off-by: Sergii Golovatiuk <sgolovatiuk@mirantis.com>
2017-02-27 21:35:51 +01:00
Jan Jungnickel
df476b0088 Initial support for vsphere as cloud provider 2017-02-27 12:51:41 -06:00
Brad Beam
56664b34a6 Lower default memory requests
This is to address out of memory issues on CI as well as help
fit deployments for people starting out with kargo on smaller
machines
2017-02-27 10:53:43 -06:00
Matthew Mosesohn
efb45733de Remove upgrade from the ubuntu-rkt-sep CI job 2017-02-27 18:16:22 +03:00
Vincent Schwarzer
0cbc3d8df6 Fixed counter in ETCD Openssl.conf
When an apiserver_loadbalancer_domain_name is added to the openssl.conf,
the counter is not increased correctly. This didn't seem to have an
effect on the current kargo version.
2017-02-27 12:01:09 +01:00
Bogdan Dobrelya
27b4e61c9f Merge pull request #946 from neith00/master
Using the command module instead of raw
2017-02-27 10:59:53 +01:00
Bogdan Dobrelya
069606947c Merge pull request #1063 from bogdando/fix
Align LB defaults with the HA docs
2017-02-27 10:14:42 +01:00
Matthew Mosesohn
6ae6b7cfcd Merge pull request #1072 from gkopylov/fix_doc_issue
Fix cluster.yml file extension in docs
2017-02-26 15:12:45 +03:00
Kopylov German
d197ce230f Fix cluster.yml file extension in docs 2017-02-26 13:42:52 +03:00
Matthew Mosesohn
c6cb0d3984 Merge pull request #1069 from holser/increase_ssl_ttl
Increase SSL TTL to 3650 days
2017-02-25 10:47:30 +03:00
Sergii Golovatiuk
00cfead9bb Increase SSL TTL to 3650 days
In real scenarios 365 days is a short period of time; 3650 days is good
enough for long-running k8s environments.
2017-02-24 15:38:13 +01:00
Antoine Legrand
20b1e4db0b Merge pull request #1068 from holser/uncomment_all.yml
Uncomment one key/value in all.yml
2017-02-24 12:54:51 +01:00
Sergii Golovatiuk
a098a32f7d Uncomment one key/value in all.yml
all.yml shouldn't be empty, otherwise ansible won't be able to merge 2
dicts.

Related bug: ansible/issues/21889
2017-02-24 12:25:45 +01:00
Antoine Legrand
9ee9a1033f Merge pull request #1067 from kubernetes-incubator/ant31-patch-2
Uncommented group_vars variables
2017-02-24 11:45:17 +01:00
Antoine Legrand
eb904668b2 Uncommented group_vars variables 2017-02-24 10:54:25 +01:00
Bogdan Dobrelya
75b69876a3 Merge pull request #1064 from kubernetes-incubator/rework_vars
Add default var role
2017-02-23 21:48:23 +01:00
Antoine Legrand
08d9d24320 Add subnet var in tests 2017-02-23 15:14:28 +01:00
Antoine Legrand
c7d61af332 Comment all variables in group_vars 2017-02-23 14:02:57 +01:00
Antoine Legrand
5f7607412b Add default var role 2017-02-23 12:07:17 +01:00
Antoine Legrand
403fea39f7 Merge pull request #829 from bogdando/opts
Rework group/role vars
2017-02-23 10:39:43 +01:00
Bogdan Dobrelya
f2a4619c57 Align LB defaults with the HA docs
Signed-off-by: Bogdan Dobrelya <bogdando@mail.ru>
2017-02-23 10:32:44 +01:00
Bogdan Dobrelya
712872efba Rework inventory all by real groups' vars
* Leave all.yml to keep only optional vars
* Store groups' specific vars by existing group names
* Fix optional vars casted as mandatory (add default())
* Fix missing defaults for an optional IP var
* Relink group_vars for terraform to reflect changes

Signed-off-by: Bogdan Dobrelya <bogdando@mail.ru>
2017-02-23 09:43:42 +01:00
Matthew Mosesohn
8cbf3fe5f8 Merge pull request #1020 from mattymo/synthscale
Add synthetic scale deployment mode
2017-02-22 19:15:46 +03:00
Matthew Mosesohn
02137f8cee Merge pull request #1059 from holser/docker_iptables
iptables switch for docker
2017-02-22 08:23:58 +03:00
Matthew Mosesohn
43ea281a7f Merge pull request #1061 from ivan4th/fix-shell-vars
Fix shell special vars
2017-02-22 08:23:44 +03:00
Ivan Shvedunov
0006e5ab45 Fix shell special vars 2017-02-21 22:22:40 +03:00
Matthew Mosesohn
d821448e2f Merge branch 'master' into synthscale 2017-02-21 22:17:43 +03:00
Sergii Golovatiuk
3bd46f7ac8 Switch docker to 1.13
- Remove variable dup for Ubuntu
- Update Docker to 1.13
2017-02-21 19:10:34 +01:00
Sergii Golovatiuk
ebf9daf73e Statically disable iptables management for docker
Docker 1.13 changes the behaviour of iptables defaults from allow
to drop. This patch disables docker's iptables management as it was
in Docker 1.12 [1]

[1] https://github.com/docker/docker/pull/28257

Signed-off-by: Sergii Golovatiuk <sgolovatiuk@mirantis.com>
2017-02-21 19:10:34 +01:00
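The effect, sketched as a dockerd option (the variable name is illustrative; --iptables=false is the documented daemon flag):

```
# Restore Docker 1.12 behaviour: do not let dockerd manage iptables
docker_options: "--iptables=false {{ docker_extra_opts | default('') }}"
```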
Matthew Mosesohn
2ba66f0b26 Change coreos-alpha dns mode to host_resolvconf 2017-02-21 18:14:42 +03:00
Matthew Mosesohn
0afadb9149 Merge pull request #1046 from skyscooby/pedantic-syntax-cleanup
Cleanup legacy syntax, spacing, files all to yml
2017-02-21 17:03:16 +03:00
Matthew Mosesohn
19d0159e33 Raise timeout for get netchecker agents 2017-02-21 14:48:25 +03:00
Matthew Mosesohn
d4f15ab402 Merge pull request #1055 from mattymo/etcd-preupgrade-speedup
speed up etcd preupgrade check
2017-02-21 12:51:42 +03:00
Matthew Mosesohn
527e030283 Merge pull request #1058 from holser/update_calico_cni
Update calico-cni to 1.5.6
2017-02-20 23:09:47 +03:00
Matthew Mosesohn
634e6a381c Merge pull request #1043 from rutsky/patch-3
fix typos in azure docs
2017-02-20 20:24:05 +03:00
Matthew Mosesohn
042d094ce7 Merge pull request #1034 from rutsky/fix-openssl-lb-index
fix load balancer DNS name index evaluation in openssl.conf
2017-02-20 20:23:26 +03:00
Matthew Mosesohn
3cc1491833 Merge branch 'master' into pedantic-syntax-cleanup 2017-02-20 20:19:38 +03:00
Matthew Mosesohn
d19e6dec7a speed up etcd preupgrade check 2017-02-20 20:18:10 +03:00
Matthew Mosesohn
6becfc52a8 Merge pull request #1056 from mattymo/k8s153
Update Kubernetes to v1.5.3
2017-02-20 20:13:40 +03:00
Sergii Golovatiuk
a2cbbc5c4f Update calico-cni to 1.5.6
Signed-off-by: Sergii Golovatiuk <sgolovatiuk@mirantis.com>
2017-02-20 17:14:45 +01:00
Matthew Mosesohn
10173525d8 Update Kubernetes to v1.5.3 2017-02-20 18:14:56 +03:00
Antoine Legrand
ccdb72a422 Merge pull request #1053 from hvnsweeting/master
Update Doc
2017-02-20 10:42:16 +01:00
Hung Nguyen Viet
df96617d3c Only 1 key needed 2017-02-20 14:54:20 +07:00
Antoine Legrand
09aa3e0e79 Merge pull request #1052 from hvnsweeting/master
Put Ansible requirements first
2017-02-20 08:44:16 +01:00
Hung Nguyen Viet
a673e97f02 Put Ansible requirements first
And re-phrase all sentences to passive tense
2017-02-20 14:39:51 +07:00
Matthew Mosesohn
43e86921e0 pin coreos-alpha to 1325 2017-02-19 16:23:35 +03:00
Antoine Legrand
ad58e08a41 Merge pull request #1049 from alop/selinux
Safe disable SELinux
2017-02-19 10:26:01 +01:00
Abel Lopez
0bfc2d0f2f Safe disable SELinux
Sometimes, a sysadmin might outright delete the SELinux rpms and
delete the configuration. This causes the selinux module to fail
with
```
IOError: [Errno 2] No such file or directory: '/etc/selinux/config'\n",
"module_stdout": "", "msg": "MODULE FAILURE"}
```

This simply checks that /etc/selinux/config exists before we try
to set it to Permissive.

Update from feedback
2017-02-18 11:54:25 -08:00
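A minimal sketch of that guard (task names are illustrative):

```
- name: Check if SELinux config exists
  stat:
    path: /etc/selinux/config
  register: selinux_config

- name: Set SELinux to permissive
  selinux:
    policy: targeted
    state: permissive
  when: selinux_config.stat.exists
```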
Matthew Mosesohn
475a42767a Suppress logging for download image
This generates too much output, and during upgrade scenarios
it can bring us over the 4MB limit.
2017-02-18 19:10:26 +04:00
Matthew Mosesohn
ce4eefff6a Use first kube-master to check results 2017-02-18 14:11:51 +04:00
Matthew Mosesohn
82b247d1a4 Adapt advanced network checker for scale
Skip nodes not in ansible play (via --limit)
2017-02-18 14:09:57 +04:00
Matthew Mosesohn
a21eb036ee Add no_log to cert tar tasks
This works around 4MB limit for gitlab CI runner.
2017-02-18 14:09:57 +04:00
Matthew Mosesohn
9c1701f2aa Add synthetic scale deployment mode
New deploy modes: scale, ha-scale, separate-scale
Creates 200 fake hosts for deployment with fake hostvars.

Useful for testing certificate generation and propagation to other
master nodes.

Updated test cases descriptions.
2017-02-18 14:09:55 +04:00
Andrew Greenwood
fd17c37feb Regex syntax changes in yml mode 2017-02-17 17:30:39 -05:00
Andrew Greenwood
cde5451e79 Syntax Bugfix 2017-02-17 17:08:44 -05:00
Andrew Greenwood
ca9ea097df Cleanup legacy syntax, spacing, files all to yml
Migrate older inline= syntax to pure yml syntax for module args, to be consistent with most of the rest of the tasks
Clean up some spacing in various files
Rename some files named yaml to yml for consistency
2017-02-17 16:22:34 -05:00
Antoine Legrand
b84cc14694 Merge pull request #1029 from mattymo/graceful
Add graceful upgrade process
2017-02-17 21:24:32 +01:00
Vladimir Rutsky
a84175b3b9 fix typo: "infrastructore" 2017-02-17 23:27:38 +04:00
Vladimir Rutsky
438b4e9625 fix typos in azure docs 2017-02-17 21:39:22 +04:00
Matthew Mosesohn
a510e7b8f3 Use gce hostname as inventory name
Calico does not allow renaming hosts
2017-02-17 20:21:58 +03:00
Antoine Legrand
e16ebcad6e Merge pull request #1042 from holser/fix_facts
Fix fact tags
2017-02-17 17:56:29 +01:00
Sergii Golovatiuk
e91e58aec9 Fix fact tags
The Ansible playbook fails when tags are limited to "facts,etcd" or to
"facts". This patch allows running ansible-playbook to gather only the
facts that don't require calico/flannel/weave components to be verified,
e.g. running ansible with 'facts,bootstrap-os' or just 'facts'.

Signed-off-by: Sergii Golovatiuk <sgolovatiuk@mirantis.com>
2017-02-17 12:32:33 +01:00
Antoine Legrand
3629b9051d Merge pull request #1038 from rutsky/kubelet-mount-var-log
Mount host's /var/log into kubelet container
2017-02-17 10:26:12 +01:00
Antoine Legrand
ef919d963b Merge pull request #1040 from retr0h/vagrant-config
Better control instance sizing
2017-02-17 10:25:09 +01:00
Antoine Legrand
4545114408 Merge pull request #1037 from mattymo/coreos_fix
Fix references to CoreOS and Container Linux by CoreOS
2017-02-17 10:21:14 +01:00
Smaine Kahlouch
9ed32b9dd0 Merge pull request #1036 from rutsky/fix-kibana-default-base-url
fix typo in "kibana_base_url" variable name
2017-02-17 07:03:59 +01:00
John Dewey
45dbe6d542 Better control instance sizing
* Git ignore the user controlled config.rb.
* Ability to better control the number of instances running.
2017-02-16 13:09:34 -08:00
Vladimir Rutsky
bff955ff7e Mount host's /var/log into kubelet container
Kubelet is responsible for creating symlinks from /var/lib/docker to
/var/log to make the fluentd logging collector work.
However, without using the host's /var/log those links are invisible to
fluentd.

This is done on rkt configuration too.
2017-02-16 22:31:05 +03:00
Matthew Mosesohn
80c0e747a7 Fix references to CoreOS and Container Linux by CoreOS
Fixes #967
2017-02-16 19:25:17 +03:00
Matthew Mosesohn
617edda9ba Adjust weave daemonset for serial deployment 2017-02-16 18:24:30 +03:00
Vladimir Rutsky
7ab04b2e73 fix typo in "kibana_base_url" variable name
This typo led to kibana_base_url being undefined, so Kibana used the
default base URL ("/"), which is incorrect with the default proxy-based
access.
2017-02-16 18:17:06 +03:00
Antoine Legrand
e89056a614 Merge pull request #1033 from rutsky/reset-confirmation
ask confirmation before running reset.yml playbook
2017-02-16 16:10:58 +01:00
Matthew Mosesohn
97ebbb9672 Add graceful upgrade process
Based on #718 introduced by rsmitty.

Includes all roles and all options to support deployment of
new hosts in case they were added to inventory.

Main difference here is that master role is evaluated first
so that master components get upgraded first.

Fixes #694
2017-02-16 17:18:38 +03:00
Vladimir Rutsky
c02213e4af force reset confirmation in CI 2017-02-16 16:35:01 +03:00
Smaine Kahlouch
73e0aeb4ca Merge pull request #1031 from mattymo/defaultcalico
Change default network plugin to Calico
2017-02-16 14:04:12 +01:00
Vladimir Rutsky
a1ec6f401c fix load balancer DNS name index evaluation in openssl.conf
Looks like OpenSSL still properly handles it, even with duplicated
"DNS.X" items.
2017-02-16 00:16:13 +03:00
Vladimir Rutsky
5337d37a1c ask confirmation before running reset.yml playbook 2017-02-15 21:05:46 +03:00
Matthew Mosesohn
d92d955aeb Merge pull request #985 from rutsky/check-mode-for-shell-commands
set "check_mode: on" for read-only "shell" steps that registers result
2017-02-15 17:53:41 +03:00
Matthew Mosesohn
7ac84d386c Merge pull request #1030 from rutsky/remove-swp
remove temporary file
2017-02-15 17:44:41 +03:00
Vladimir Rutsky
8397baa700 remove temporary file 2017-02-15 17:40:05 +03:00
Matthew Mosesohn
2d65554cb9 Change default network plugin to Calico 2017-02-15 16:15:22 +03:00
Matthew Mosesohn
64e40d471c Merge pull request #1028 from holser/ansible.cfg
Add timings to RECAP output.
2017-02-15 12:41:49 +03:00
Sergii Golovatiuk
c5ea29649b Add timings to RECAP output.
- Starting from version 2.0 ansible has 'callback_whitelist =
  profile_tasks'. It allows analyzing CI runs to find time regressions.
- Add skippy to CI's ansible.cfg

Signed-off-by: Sergii Golovatiuk <sgolovatiuk@mirantis.com>
2017-02-14 18:47:02 +01:00
Antoine Legrand
410438a0e3 Merge pull request #1008 from bradbeam/rkt-proxy
Adding support for proxy w/ rkt kubelet
2017-02-14 17:52:21 +01:00
Spencer Smith
fbaef7e60f specify grace period for draining 2017-02-14 18:51:13 +03:00
Spencer Smith
017a813621 first cut of an upgrade process 2017-02-14 18:51:13 +03:00
Brad Beam
4c891b8bb0 Adding support for proxy w/ rkt kubelet 2017-02-14 08:09:49 -06:00
Matthew Mosesohn
948d9bdadb Merge pull request #1019 from mattymo/issue1011
Update calico to v1.0.2
2017-02-14 14:01:25 +03:00
Matthew Mosesohn
b7258ec3bb Merge pull request #1013 from mattymo/remove_masqerade_all
Disable kube_proxy_masquerade_all
2017-02-14 14:00:29 +03:00
Antoine Legrand
93cb5a5bd6 Merge pull request #1027 from hvnsweeting/master
Multiples doc fixes
2017-02-14 11:39:22 +01:00
Hung Nguyen Viet
d8f46c4410 Highlight important action 2017-02-14 17:18:25 +07:00
Hung Nguyen Viet
d0757ccc5e Fix typo 2017-02-14 17:18:22 +07:00
Antoine Legrand
f4f730bd8a Merge pull request #1025 from holser/bug/961
Install pip on Ubuntu
2017-02-14 10:31:42 +01:00
Matthew Mosesohn
f5e27f1a21 Merge pull request #1021 from holser/remove_deprecated
Replace always_run with check_mode
2017-02-14 11:25:58 +03:00
Matthew Mosesohn
bb6415ddc4 Merge pull request #1015 from holser/rkt_ssl_ca_dirs
Set ssl_ca_dirs for rkt based on fact
2017-02-14 11:25:17 +03:00
Sergii Golovatiuk
2b6179841b Install pip on Ubuntu
- Refactor 'Check if bootstrap is needed' as an ansible loop. This allows
  adding new elements easily without refactoring. Add pip to the list.
- Refactor the 'Install python 2.x' task to run once if any of the rc
  codes != 0. need_bootstrap is an array of hashes, so map gets a
  single array of rc statuses; the statuses are sorted, and the last
  (highest) element is taken and converted to bool.

Closes: #961
Signed-off-by: Sergii Golovatiuk <sgolovatiuk@mirantis.com>
2017-02-13 19:35:13 +01:00
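A rough reconstruction of the loop described above (the binary list and package names are examples, not the exact ones added):

```
- name: Check if bootstrap is needed
  raw: which {{ item }}
  register: need_bootstrap
  failed_when: false
  with_items:
    - python
    - pip

- name: Install python 2.x and pip
  raw: apt-get update && apt-get install -y python-minimal python-pip
  # need_bootstrap.results is a list of hashes; collect the rc of each
  # check, take the highest, and bootstrap if any was non-zero
  when: need_bootstrap.results | map(attribute='rc') | sort | last | bool
```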
Antoine Legrand
e877cd2874 Merge pull request #1024 from holser/bug/961
Install pip on Ubuntu
2017-02-13 17:53:57 +01:00
Matthew Mosesohn
203ddfcd43 Merge pull request #1023 from mattymo/fix_dnsmasq_cleanup
Clean up dnsmasq purge task
2017-02-13 19:50:01 +03:00
Vladimir Rutsky
09847567ae set "check_mode: no" for read-only "shell" steps that registers result
"shell" step doesn't support check mode, which currently leads to failures,
when Ansible is being run in check mode (because Ansible doesn't run command,
assuming that command might have effect, and no "rc" or "output" is registered).

Setting "check_mode: no" allows to run those "shell" commands in check mode
(which is safe, because those shell commands doesn't have side effects).
2017-02-13 18:53:41 +03:00
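The pattern, as a small sketch (the command is a placeholder):

```
- name: Gather currently deployed version          # read-only probe
  shell: cat /etc/kubernetes/deployed_version 2>/dev/null
  register: deployed_version
  check_mode: no        # safe: the command has no side effects
  changed_when: false
```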
Sergii Golovatiuk
732ae69d22 Install pip on Ubuntu
Closes: #961
Signed-off-by: Sergii Golovatiuk <sgolovatiuk@mirantis.com>
2017-02-13 16:27:09 +01:00
Greg Althaus
2b10376339 When resolv.conf changes during host_resolvconf mode, we need to
restart the controller to get the new file configuration.
I'm not fond of this form and would like a better way, but this
seems to "work".
2017-02-13 09:20:02 -06:00
Antoine Legrand
9667ac3baf Merge pull request #1022 from kubernetes-incubator/ant31-patch-1
Document gitlab-runner.sh
2017-02-13 15:40:34 +01:00
Matthew Mosesohn
b5be335db3 Clean up dnsmasq purge task 2017-02-13 17:30:15 +03:00
Antoine Legrand
d33945780d Document gitlab-runner.sh 2017-02-13 15:04:35 +01:00
Sergii Golovatiuk
5f4cc3e1de Replace always_run with check_mode
always_run was deprecated in Ansible 2.2 and will be removed in 2.4;
ansible logs contain "[DEPRECATION WARNING]: always_run is deprecated.
Use check_mode = no instead". This patch fixes the deprecation.
2017-02-13 15:00:56 +01:00
Matthew Mosesohn
ec567bd53c Update calico to v1.0.2
Also calico-cni to v1.5.6, calico-policy to v0.5.2

Fixes: #1011
2017-02-13 15:39:25 +03:00
Sergii Golovatiuk
aeadaa1184 Set ssl_ca_dirs for rkt based on fact
Since the systemd kubelet.service has {{ ssl_ca_dirs }}, the fact should
be gathered before writing kubelet.service.

Closes: #1007
Signed-off-by: Sergii Golovatiuk <sgolovatiuk@mirantis.com>
2017-02-13 13:28:29 +01:00
Matthew Mosesohn
2f0f0006e3 Merge pull request #988 from mattymo/feat/rolling3
Add CI cases for testing upgrade from v2.0.1 release
2017-02-10 18:09:43 +03:00
Matthew Mosesohn
de047a2b8c Merge pull request #983 from vwfs/centos_kernel_upgrade
Add kernel upgrade for CentOS
2017-02-10 14:40:27 +03:00
Antoine Legrand
86a35652bb Merge pull request #1009 from mattymo/dnsmasq_updates
Enable reset of dnsmasq if manifest or config changes
2017-02-10 11:43:09 +01:00
Matthew Mosesohn
6ae70e03cb fixup upgrades for canal and weave 2017-02-10 13:27:41 +03:00
Matthew Mosesohn
2c532cb74d Disable kube_proxy_masquerade_all
Fixes #1012
2017-02-10 13:16:39 +03:00
Matthew Mosesohn
779f20d64e Merge pull request #1010 from bogdando/fixes
Fix misleading HA docs
2017-02-10 13:01:29 +03:00
Bogdan Dobrelya
89ae9f1f88 Merge pull request #1002 from code0x9/master
use ansible sysctl module for config ip forwarding
2017-02-10 10:40:18 +01:00
Bogdan Dobrelya
ed1ab11001 Fix misleading HA docs
Signed-off-by: Bogdan Dobrelya <bogdando@mail.ru>
2017-02-10 10:28:27 +01:00
Alexander Block
d2e010cbe1 Add kernel upgrade for CentOS 2017-02-10 09:29:12 +01:00
Matthew Mosesohn
a44a0990f5 Enable reset of dnsmasq if manifest or config changes 2017-02-10 10:40:07 +04:00
Matthew Mosesohn
2f88c9eefe Merge pull request #989 from holser/kubelet_remedy
Kubernetes Reliability Improvements
2017-02-10 09:29:29 +03:00
Matthew Mosesohn
60f1936a62 Merge pull request #1004 from galthaus/kubelet-load-modules
Allow kubelet to load kernel modules
2017-02-10 09:28:16 +03:00
Matthew Mosesohn
ee15f99dd7 Add CI cases for testing upgrade from v2.0.1 release
These are manual trigger jobs, but should be run if any PR
impacts upgrades.
2017-02-10 10:20:58 +04:00
Matthew Mosesohn
b0ee27ba46 Merge pull request #1006 from mattymo/fix_weave_upgrade
Enable weave upgrade from previous versions
2017-02-10 09:03:49 +03:00
Antoine Legrand
067bbaa473 Merge pull request #1001 from idcrook/kargo-issue-1000-efk-enable
removed explicit role for efk in cluster.yml
2017-02-10 03:03:18 +01:00
Sergii Golovatiuk
c07d60bc90 Kubernetes Reliability Improvements
- Exclude kubelet CPU/RAM (kube-reserved) from the cgroup. It decreases
  the chance of overcommitment
- Add a possibility to modify the kubelet node-status-update-frequency
- Add a possibility to configure node-monitor-grace-period,
  node-monitor-period, pod-eviction-timeout for the Kubernetes controller
  manager
- Add Kubernetes Reliability Documentation with recommendations for
  various scenarios.

Signed-off-by: Sergii Golovatiuk <sgolovatiuk@mirantis.com>
2017-02-09 23:54:08 +01:00
Matthew Mosesohn
29fd957352 Enable weave upgrade from previous versions
Raise readiness probe initial time to 60 (was 30)
2017-02-09 21:39:31 +03:00
Matthew Mosesohn
ef10ce04e2 Merge pull request #1005 from rutsky/patch-2
fix kube_apiserver_ip/kube_apiserver_port description
2017-02-09 21:08:15 +03:00
Vladimir Rutsky
f0269b28f4 fix kube_apiserver_ip/kube_apiserver_port description 2017-02-09 21:47:36 +04:00
Matthew Mosesohn
0a7c6eb9dc Merge pull request #998 from mattymo/fix_upgrade_daemonsets
Fix upgrade for all daemonset type resources
2017-02-09 20:02:21 +03:00
Greg Althaus
3f0c13af8a Make kubelet_load_modules always present but false.
Update code and docs for that assumption.
2017-02-09 10:25:44 -06:00
Greg Althaus
fcd78eb1f7 Due to the nsenter and other reworks, it appears that
kubelet lost the ability to load kernel modules.  This
puts that back by adding the lib/modules mount to kubelet.

The new variable kubelet_load_modules can be set to true
to enable this item.  It is OFF by default.
2017-02-09 10:02:26 -06:00
Matthew Mosesohn
17dfae6d4e Merge pull request #999 from holser/decrease_weave_ram_limits
Lower weave RAM settings.
2017-02-09 13:19:12 +03:00
Mark Lee
e414c25fd7 follow sysctl.conf file symlink if linked 2017-02-09 18:16:52 +09:00
Mark Lee
34a71554ae use ansible sysctl module for config ip forwarding 2017-02-09 17:28:44 +09:00
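What "use the sysctl module" looks like in practice, as an illustrative task (the sysctl_file variable is an assumption tied to the symlink fix above):

```
- name: Enable IP forwarding
  sysctl:
    name: net.ipv4.ip_forward
    value: 1
    sysctl_file: "{{ sysctl_file_path | default('/etc/sysctl.conf') }}"
    state: present
    reload: yes
```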
Bogdan Dobrelya
3b1a196c75 Merge pull request #902 from insequent/master
Adding vault role
2017-02-09 09:24:52 +01:00
Bogdan Dobrelya
105dbf471e Merge pull request #993 from code0x9/master
enable proxy support on docker repository
2017-02-09 09:21:01 +01:00
David Crook
d4d9f27a8d removed explicit role for efk in cluster.yml 2017-02-08 20:48:28 -07:00
Antoine Legrand
68df0d4909 Merge pull request #986 from vwfs/dnsmasq_system_nameservers
Also add the system nameservers to upstream servers in dnsmasq
2017-02-08 23:21:54 +01:00
Antoine Legrand
9c572fe54b Merge pull request #984 from rutsky/patch-2
fix typo: "explicetely"
2017-02-08 23:19:01 +01:00
Josh Conant
245e05ce61 Vault security hardening and role isolation 2017-02-08 21:41:36 +00:00
Josh Conant
f4ec2d18e5 Adding the Vault role 2017-02-08 21:31:28 +00:00
Sergii Golovatiuk
4124d84c00 Lower weave RAM settings.
- Since Weave 1.8.x was rewritten in Golang we may decrease RAM settings
  to continue using g1-small for CI
2017-02-08 18:50:36 +01:00
Matthew Mosesohn
3c713a3f53 Fix upgrade for all daemonset type resources
Daemonsets cannot be simply upgraded through a single API call,
regardless of any kubectl documentation. The resource must be
purged and then recreated in order to make any changes.
2017-02-08 18:16:00 +03:00
Alexander Block
89e570493a Also add the system nameservers to upstream servers in dnsmasq
Also make no-resolv unconditional again; otherwise we may end up in
a resolver loop. The resolver loop was the cause of the piling-up
parallel queries.
2017-02-08 14:38:55 +01:00
Matthew Mosesohn
16674774c7 Merge pull request #994 from mattymo/docker_save
Change docker save compress level to 1
2017-02-08 15:13:15 +03:00
Matthew Mosesohn
0180ad7f38 Merge pull request #990 from mattymo/fix_cert_upgrade
Fix check for node-NODEID certs existence
2017-02-08 14:44:09 +03:00
Matthew Mosesohn
bfd1ea1da1 Merge pull request #971 from bradbeam/efk
Adding EFK logging stack
2017-02-08 14:28:04 +03:00
Mark Lee
3eacd0c871 Update rh_docker.repo.j2 2017-02-08 20:03:51 +09:00
Matthew Mosesohn
d587270293 Merge pull request #992 from vwfs/host_mount_dev
Host mount /dev for kubelet
2017-02-08 13:45:22 +03:00
Matthew Mosesohn
3eb13e83cf Change docker save compress level to 1
Faster gzip improves CI deploy times by at least 2 mins.

Fixes #982
2017-02-08 13:25:11 +03:00
Mark Lee
df761713aa Merge branch 'master' of https://github.com/kubespray/kargo 2017-02-08 19:19:26 +09:00
Mark Lee
de50f37fea enable proxy support on docker repository 2017-02-08 19:19:08 +09:00
Matthew Mosesohn
bad6076905 Merge pull request #987 from mattymo/etcd-retune
Re-tune ETCD performance params
2017-02-08 13:00:25 +03:00
Bogdan Dobrelya
c2bd76a22e Merge pull request #956 from adidenko/update-netchecker
Update playbooks to support new netchecker
2017-02-08 10:09:46 +01:00
Alexander Block
010fe30b53 Host mount /dev for kubelet 2017-02-08 09:55:51 +01:00
Matthew Mosesohn
e5779ab786 Fix check for node-NODEID certs existence
Fixes upgrade from pre-individual node cert envs.
2017-02-07 21:06:48 +03:00
Matthew Mosesohn
71e14a13b4 Re-tune ETCD performance params
Reduce election timeout to 5000ms (was 10000ms)
Raise heartbeat interval to 250ms (was 100ms)
Remove etcd cpu share (was 300)
Make etcd_cpu_limit and etcd_memory_limit optional.
2017-02-07 20:15:14 +03:00
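The new values, sketched as inventory overrides (variable names are assumptions matching the description above):

```
etcd_election_timeout: 5000      # ms, lowered from 10000
etcd_heartbeat_interval: 250     # ms, raised from 100
# etcd_cpu_limit / etcd_memory_limit are now optional and may be unset
```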
Matthew Mosesohn
491074aab1 Merge pull request #969 from mattymo/port_reserve
Prevent dynamic port allocation in nodePort range
2017-02-07 18:24:57 +03:00
Aleksandr Didenko
54af533b31 Update playbooks to support new netchecker
Netchecker is rewritten in Go with some new args instead of
env variables. Also netchecker-server no longer requires a kubectl
container. Updating playbooks accordingly.
2017-02-07 15:20:34 +01:00
Matthew Mosesohn
4f13043d14 Merge pull request #976 from holser/bug/975
Improve Weave
2017-02-06 22:48:13 +03:00
Vladimir Rutsky
6a5df4d999 fix typo: "pubilcally" 2017-02-06 21:35:02 +04:00
Vladimir Rutsky
d41602088b fix typo: "explicetely" 2017-02-06 21:29:11 +04:00
Matthew Mosesohn
f3a0f73588 Prevent dynamic port allocation in nodePort range
kube_apiserver_node_port_range should be accessible only
to kube-proxy and not be taken by a dynamic port allocation.

Potentially temporary if https://github.com/kubernetes/kubernetes/issues/40920
gets fixed.
2017-02-06 20:01:16 +03:00
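One way to reserve that range from the ephemeral allocator, assuming the kernel's ip_local_reserved_ports knob (a sketch, not necessarily the PR's exact task):

```
- name: Reserve the nodePort range from dynamic port allocation
  sysctl:
    name: net.ipv4.ip_local_reserved_ports
    value: "{{ kube_apiserver_node_port_range | default('30000-32767') }}"
    state: present
```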
Matthew Mosesohn
be1e1b41bd Merge pull request #981 from kubernetes-incubator/revert-911-DROP_CAPS
Revert "Drop linux capabilities and rework users/groups"
2017-02-06 17:52:58 +03:00
Matthew Mosesohn
fd30131dc2 Revert "Drop linux capabilities and rework users/groups" 2017-02-06 15:58:54 +03:00
Sergii Golovatiuk
5122697f0b Improve Weave
- Remove weave CPU limits from .gitlab-ci.yml. Closes: #975
- Fix weave version in documentation

Signed-off-by: Sergii Golovatiuk <sgolovatiuk@mirantis.com>
2017-02-06 13:24:40 +01:00
Bogdan Dobrelya
b7bf502e02 Merge pull request #978 from rutsky/patch-1
remove extra `~`
2017-02-06 12:07:54 +01:00
Bogdan Dobrelya
3f70e3a843 Merge pull request #977 from holser/bug/973
Add .swp .swo .swn to .gitignore
2017-02-06 12:07:07 +01:00
Bogdan Dobrelya
cae2982d81 Merge pull request #911 from bogdando/DROP_CAPS
Drop linux capabilities and rework users/groups
2017-02-06 12:05:51 +01:00
Vladimir Rutsky
b638c89556 remove extra ~ 2017-02-06 15:05:24 +04:00
Bogdan Dobrelya
9bc51bd0e2 Merge pull request #972 from kubernetes-incubator/update-roadmap
Update roadmap.md
2017-02-06 12:03:09 +01:00
Sergii Golovatiuk
408b4f3f42 Add .swp .swo .swn to .gitignore
According to http://vimdoc.sourceforge.net/htmldoc/recover.html, vim
creates .swo, .swn, and .swp files. This patch adds them to .gitignore in
all directories recursively.

Closes: #973
2017-02-06 12:00:49 +01:00
Antoine Legrand
d818ac1d59 Update roadmap.md 2017-02-04 23:23:24 +01:00
Antoine Legrand
bd1c764a1a Merge pull request #963 from rutsky/bastion-ansible-host
handle both 'ansible_host' and 'ansible_ssh_host' in bastion configuration
2017-02-04 15:42:39 -05:00
Antoine Legrand
8f377ad8bd Merge pull request #968 from rutsky/remove-deprecated-ubuntu-bootstrap
remove deprecated ubuntu-bootstrap.yml script
2017-02-04 15:36:49 -05:00
Brad Beam
df3e11bdb8 Adding EFK logging stack 2017-02-03 16:27:08 -06:00
Vladimir Rutsky
97dabbe997 remove deprecated ubuntu-bootstrap.yml script
Signed-off-by: Vladimir Rutsky <rutsky.vladimir@gmail.com>
2017-02-03 15:02:17 +03:00
Bogdan Dobrelya
5a7a3f6d4a Merge pull request #949 from vmtyler/master
Fixes Support for OpenStack v3 credentials
2017-02-03 12:22:00 +01:00
Vladimir Rutsky
b4327fdc99 handle both 'ansible_host' and 'ansible_ssh_host' in bastion configuration
'ansible_ssh_host' is deprecated in Ansible 2.0 and at least
'contrib/inventory_builder/inventory.py' uses 'ansible_host' instead.
2017-02-02 18:34:53 +03:00
Matthew Mosesohn
10f924a617 Merge pull request #927 from holser/nsenter_fix
Remove nsenter workaround
2017-02-02 18:18:15 +03:00
Matthew Mosesohn
3dd6a01c8b Merge pull request #901 from galthaus/dns-tweak
DHCP Hook protections
2017-02-02 16:47:16 +03:00
Sergii Golovatiuk
585afef945 Remove nsenter workaround
- Docker 1.12 and later don't need the nsenter hack. This patch removes
  it. Also, it bumps the minimum version to 1.12.

Closes #776

Signed-off-by: Sergii Golovatiuk <sgolovatiuk@mirantis.com>
2017-02-02 14:38:11 +01:00
Matthew Mosesohn
bdc65990e1 Merge pull request #958 from holser/fix_weave_cpu
Fix CPU out of scope for Weave-net
2017-02-02 16:05:47 +03:00
Sergii Golovatiuk
f2e4ffcac2 Fix weave-net after upgrade to 1.8.2
- Set recommended CPU settings
- Cleans up upgrade to weave 1.8.2. The original WeaveWorks
daemonset definition uses the weave-net name.
- Limit DS creation to master
- Combined 2 tasks into one with better condition
2017-02-02 10:31:58 +01:00
Matthew Mosesohn
ae66b6e648 Merge pull request #957 from mattymo/weave-net-naming
Rename weave-kube to weave-net
2017-02-02 10:18:02 +03:00
Greg Althaus
923057c1a8 This continues the DHCP hook checks. Also protect the create side
if the system doesn't have any config files at all.
2017-01-31 09:56:27 -06:00
Matthew Mosesohn
0f6e08d34f Merge pull request #951 from mattymo/k8s-certs-scale
Fix cert distribution at scale
2017-01-31 18:49:26 +03:00
Matthew Mosesohn
4889a3e2e1 Merge pull request #954 from artem-panchenko/improve_dnsmasq
Explicitly set config path for DNSMasq
2017-01-31 18:48:46 +03:00
Matthew Mosesohn
39d87a96aa Rename weave-kube to weave-net
Cleans up upgrade to weave 1.8.2. The original WeaveWorks
daemonset definition uses the weave-net name.
2017-01-31 18:47:27 +03:00
Bogdan Dobrelya
e7c03ba66a Merge pull request #955 from mattymo/disable-idempotency-check
Disable idempotency for ubuntu-weave-sep
2017-01-31 14:55:27 +01:00
Matthew Mosesohn
08822ec684 Fix cert distribution at scale
Use stdin instead of bash args to pass node filenames and base64 data.
Use tempfile for master cert data
2017-01-31 16:27:45 +03:00
Matthew Mosesohn
6463a01e04 Merge pull request #880 from bradbeam/weave-kube
Weave kube
2017-01-31 13:31:09 +03:00
Matthew Mosesohn
0cf1850465 Disable idempotency for ubuntu-weave-sep
CI is failing 40% of the time due to errors in reset.
Let's disable idempotency check per-patch until we fix it.

Fixes #953
2017-01-31 13:23:27 +03:00
Artem Panchenko
1418fb394b Explicitly set config path for DNSMasq
When DNSMasq is configured to read its settings
from a folder ('-7' or '--conf-dir' option), it only
checks that the directory exists and doesn't fail if
it's empty. This can lead to a situation where DNSMasq
is running and handling requests but is not properly
configured, so some queries can't be resolved.
2017-01-31 12:14:57 +02:00
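A minimal sketch of the safer shape as a container spec fragment (image and file path invented for illustration): pointing dnsmasq at a concrete file makes it fail fast instead of silently serving with no config:

    containers:
      - name: dnsmasq
        image: andyshinn/dnsmasq:2.72    # image assumed for illustration
        args:
          - --no-resolv
          - --conf-file=/etc/dnsmasq.d/01-kube-dns.conf  # dnsmasq exits if this file is missing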
Matthew Mosesohn
e4eda88ca9 Merge pull request #944 from tureus/skip-cloud-config-on-etcd
Bugfix: skip cloud_config on etcd
2017-01-30 20:12:36 +03:00
Bogdan Dobrelya
71a3c97d6f Merge pull request #943 from bradbeam/cilint
Fixing lint check for ci
2017-01-30 09:19:44 +01:00
Antoine Legrand
1c3d2924ae Merge pull request #947 from bradbeam/libs
Consolidating kube.py module
2017-01-29 00:02:32 +01:00
Brad Beam
a11b9d28bd Upgrading weave to weave-kube 2017-01-27 17:05:25 -06:00
Brad Beam
b54eb609bf Consolidating kube.py module 2017-01-27 11:28:11 -06:00
Bogdan Dobrelya
dc8ff413f9 Merge pull request #948 from mattymo/update_coreos
Update coreos-stable image
2017-01-27 17:53:17 +01:00
Tyler Britten
f8ffa1601d Fixed for non-null output 2017-01-27 10:47:59 -05:00
Tyler Britten
da01bc1fbb Updated OpenStack vars to check for tenant_id (v2) and project_id (v3) 2017-01-27 10:26:20 -05:00
Matthew Mosesohn
a2079a9ca9 Update coreos-stable image
Our old coreos-stable image has docker 1.10
2017-01-27 16:20:40 +04:00
neith00
bbc8c09753 Using the command module instead of raw
Also fixed the syntax.
2017-01-26 16:28:48 +01:00
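The distinction, as a hedged sketch (tasks invented for illustration): raw ships a bare command string over SSH and needs no Python on the target, while command goes through the module subsystem and supports options such as creates:

    - name: Bootstrap python (no python required on the target yet)
      raw: test -e /usr/bin/python || apt-get -y install python

    - name: The same install via the command module (python must already exist)
      command: apt-get -y install python
      args:
        creates: /usr/bin/python   # makes the task idempotent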
Matthew Mosesohn
a627299468 Merge pull request #941 from adidenko/use_ansible_hostname_in_calico
Switch to ansible_hostname in calico
2017-01-26 13:06:35 +03:00
Xavier Lange
e5fdc63bdd Bugfix: skip cloud_config on etcd 2017-01-25 14:09:21 -08:00
Brad Beam
fe83e70074 Fixing lint check for ci 2017-01-25 09:54:32 -06:00
Aleksandr Didenko
46c177b982 Switch to ansible_hostname in calico
For consistency with kubernetes services we should use the same
hostname for nodes, which is 'ansible_hostname'.

Also fixing a missed 'kube-node' in templates: Calico is installed
on 'k8s-cluster' hosts, not only 'kube-node'.
2017-01-25 11:49:58 +01:00
Bogdan Dobrelya
1df50adc1c Merge pull request #933 from frozenice/hide-skipped-hosts
add skippy stdout callback
2017-01-25 10:33:20 +01:00
Bogdan Dobrelya
b6cd9a4c4b Merge pull request #938 from bradbeam/ci
Splitting out moderator check from syntax check
2017-01-25 10:12:11 +01:00
Brad Beam
2333ec4d1f Splitting out moderator check from syntax check
- Attempt to clarify CI runs from contributors
2017-01-24 23:05:12 -06:00
Bogdan Dobrelya
85a8a54d3e Merge pull request #935 from sc68cal/terraform_groupvars_update
Update the group_vars for Terraform
2017-01-24 11:33:17 +01:00
Bogdan Dobrelya
7294a22901 Merge pull request #934 from frozenice/use-api-pwd-for-root
also use kube_api_pwd for root account
2017-01-24 11:24:02 +01:00
Matthew Mosesohn
f4b7474ade Merge pull request #926 from adidenko/fix-calico-rr-for-masters
Fix calico-rr peering with k8s masters
2017-01-24 12:38:52 +03:00
Matthew Mosesohn
9428321607 Merge pull request #932 from vwfs/centos_pin_docker_version
Pin docker version on RedHat and CentOS to the desired version
2017-01-24 12:21:50 +03:00
Matthew Mosesohn
882544446a Merge pull request #928 from sc68cal/terraform_identity_version
Specify the version of the credentials to download from Horizon
2017-01-24 12:21:27 +03:00
Sean M. Collins
73160c9b90 Update terraform's group_vars to be a symlink
That way, it will not become stale.

Related bug #929
2017-01-23 16:08:37 -05:00
Sean M. Collins
2184d6a3ff Specify the version of the credentials to download from Horizon
More recent versions of OpenStack Horizon provide Identity v2 and
Identity v3 versions of the RC file.
2017-01-23 14:52:51 -05:00
David Kirstein
6e35895b44 also use kube_api_pwd for root account
This makes it a bit more secure. Also, the password can now be changed with an (inventory) variable (no need to edit all.yml).
2017-01-23 19:09:30 +01:00
David Kirstein
8009ff8537 add skippy stdout callback
It removes the teal lines when a host is skipped for a task. This makes the output less spammy and much easier to read. Empty TASK blocks are still included in the output, but that's ok.
2017-01-23 18:53:14 +01:00
Alexander Block
9bf792ce0b Pin docker version on RedHat and CentOS to the desired version 2017-01-23 12:39:54 +01:00
Aleksandr Didenko
f05aaeb329 Fix calico-rr peering with k8s masters
Calico-rr is broken for deployments with separate k8s-master and
k8s-node roles. In order to fix it we should peer k8s-cluster
nodes with calico-rr, not just k8s-node. The same applies to peering
with routers.

Closes #925
2017-01-23 10:19:09 +01:00
Bogdan Dobrelya
1bdf34e7dc Merge pull request #915 from bradbeam/ci
Sorting ansible args, fixed ci cluster_mode
2017-01-20 09:43:10 +01:00
Bogdan Dobrelya
cd25bfca91 Merge pull request #884 from mattymo/inventory_builder_scale
Add scale thresholds to split etcd and k8s-masters
2017-01-20 09:34:45 +01:00
Bogdan Dobrelya
1b621ab81c Merge pull request #873 from crodetsky/fix_test_cases
Genericize test cases and namespace create pod
2017-01-20 09:30:35 +01:00
Bogdan Dobrelya
cb2e5ac776 Drop linux capabilities and rework users/groups
* Drop linux capabilities for unprivileged containerized
  workloads Kargo configures for deployments.
* Configure required securityContext/user/group/groups for kube
  components' static manifests, etcd, calico-rr and k8s apps,
  like dnsmasq daemonset.
* Rework cloud-init (etcd) users creation for CoreOS.
* Fix nologin paths, adjust defaults for addusers role and ensure
  supplementary groups membership added for users.
* Add netplug user for network plugins (yet unused by privileged
  networking containers though).
* Grant the kube and netplug users read access for etcd certs via
  the etcd certs group.
* Grant group read access to kube certs via the kube cert group.
* Remove privileged mode for calico-rr and run it under its uid/gid
  and supplementary etcd_cert group.
* Adjust docs.
* Align cpu/memory limits and dropped caps with added rkt support
  for control plane.

Signed-off-by: Bogdan Dobrelya <bogdando@mail.ru>
2017-01-20 08:50:42 +01:00
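As a hedged Kubernetes manifest fragment (the uid is invented for illustration), dropping capabilities and pinning the user looks roughly like:

    containers:
      - name: dnsmasq
        securityContext:
          runAsUser: 999       # hypothetical dedicated kube uid
          capabilities:
            drop:
              - ALL            # unprivileged workloads keep no linux capabilities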
Matthew Mosesohn
8ce32eb3e1 Merge pull request #905 from galthaus/async-runs
Add tasks to ensure that the first nodes have their directories for cert gen
2017-01-19 18:32:27 +03:00
Matthew Mosesohn
aae0314bda Merge pull request #904 from galthaus/nginx-port-config
Add nginx local balancer port configuration variable
2017-01-19 18:31:57 +03:00
Matthew Mosesohn
35d5248d41 Merge pull request #913 from galthaus/apps-master-only
Ansible apps should only check for api-server running on the master.
2017-01-19 18:30:58 +03:00
Matthew Mosesohn
0ccc2555d3 Merge pull request #917 from mattymo/rkt_resolvconf
Fix setting resolvconf when using rkt deploy mode
2017-01-19 18:30:21 +03:00
Matthew Mosesohn
b26a711e96 Merge pull request #916 from mattymo/update_ansible
Update Ansible to 2.2.1
2017-01-19 18:13:45 +03:00
Matthew Mosesohn
2218a052b2 Merge pull request #921 from mattymo/docker113
Add docker 1.13, update 1.12 to 1.12.6
2017-01-19 18:13:21 +03:00
Matthew Mosesohn
40f419ca54 Merge pull request #922 from holser/dnsmasq_dns-forward-max
Allow to specify number of concurrent DNS queries
2017-01-19 18:08:04 +03:00
Matthew Mosesohn
f742fc3dd1 Add scale thresholds to split etcd and k8s-masters
Also adds calico-rr group if there are standalone etcd nodes.
Now if there are 50 or more nodes, 3 etcd nodes will be standalone.
If there are 200 or more nodes, 2 kube-masters will be standalone.
If the thresholds are exceeded, the kube-node group will not include nodes that
belong to the etcd or kube-master groups (per the statements above).
2017-01-19 17:30:56 +03:00
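For a hypothetical 60-node inventory (at or above 50, below 200), the resulting split would look roughly like this sketch:

    etcd: [node1, node2, node3]    # standalone; excluded from kube-node
    kube-master: [node4, node5]    # still doubles as kube-node below 200 nodes
    # kube-node: node4 .. node60 -- every host except the standalone etcd members
    # at 200+ nodes, node4/node5 would be excluded from kube-node as well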
Matthew Mosesohn
33fbcc56d6 Add docker 1.13, update 1.12 to 1.12.6
Fixes #903
2017-01-19 13:58:36 +03:00
Sergii Golovatiuk
61d05dea58 Allow to specify number of concurrent DNS queries
ndots creates overhead, as every pod opens 5 concurrent connections
that are forwarded to SkyDNS. Under some circumstances dnsmasq may
stop forwarding traffic, logging "Maximum number of concurrent DNS
queries reached".

This patch allows configuring the number of concurrent forwarded DNS
queries ("dns-forward-max") as well as "cache-size", leaving the default
values as they were before.

Signed-off-by: Sergii Golovatiuk <sgolovatiuk@mirantis.com>
2017-01-19 11:47:37 +01:00
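A sketch of the resulting knobs as group vars, assuming the variable names mirror the dnsmasq option names; the values shown are dnsmasq's own defaults:

    dns_forward_max: 150   # concurrent forwarded queries; raise for ndots-heavy clusters
    dns_cache_size: 150    # dnsmasq cache-size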
Matthew Mosesohn
8a821060a3 Update Ansible to 2.2.1 2017-01-19 13:46:46 +03:00
Greg Althaus
0d44599a63 Add explicit name printing in task names for delegated tasks during
cert creation
2017-01-18 14:06:50 -06:00
crodetsky
8e29b08070 Genericize test cases and namespace create pod
This change modifies 020_check-create-pod and 030_check-network test cases to
target `kube-master[0]` instead of `node1` as these tests can be useful in
deployments that do not use the same naming convention as the basic tests.

This change also modifies 020_check-create-pod to namespace into a `test`
namespace allowing the `get pods` command to get its expected number of
running containers.

Closes #866 and #867.
2017-01-18 14:52:35 -05:00
Matthew Mosesohn
b6c3e61603 Fix setting resolvconf when using rkt deploy mode
rkt deploy mode doesn't create {{ bin_dir }}/kubelet, so
let's rely on the kubelet.env file instead.
2017-01-18 19:18:47 +03:00
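A minimal sketch of the deployment-agnostic check, assuming kubelet.env lives under the kube config directory:

    - name: Check whether kubelet is configured (docker and rkt deploys alike)
      stat:
        path: "{{ kube_config_dir }}/kubelet.env"   # path assumed for illustration
      register: kubelet_configured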
Brad Beam
dc08b75c6a Sorting ansible args, fixed ci cluster_mode
- s/separated/separate/g for cluster_mode so it now generates the correct number of instances
2017-01-18 08:03:04 -06:00
Matthew Mosesohn
5420fa942e Merge pull request #897 from holser/flush_handlers_before_etcd
Flush handlers before etcd restart
2017-01-18 12:27:01 +03:00
Matthew Mosesohn
1ee33d3a8d Merge pull request #910 from mattymo/escape_curly
Fix ansible 2.2.1 handling of registered vars
2017-01-18 11:13:01 +03:00
Greg Althaus
61dab8dc0b Should only check for api-server running on the master.
If this runs on other nodes, it will fail the playbook.
2017-01-17 15:57:34 -06:00
Greg Althaus
0022a2b29e Add doc updates. 2017-01-17 13:15:48 -06:00
Matthew Mosesohn
b2a27ed089 Fix bash completion installation 2017-01-17 20:36:58 +03:00
Matthew Mosesohn
d8ae50800a Work around escaping curly braces for docker inspect 2017-01-17 20:35:38 +03:00
Sergii Golovatiuk
43fa72b7b7 Flush handlers before etcd restart
systemctl daemon-reload should be run when a task modifies/creates a
systemd unit for etcd. Otherwise etcd won't be able to start.

Closes #892

Signed-off-by: Sergii Golovatiuk <sgolovatiuk@mirantis.com>
2017-01-17 15:04:25 +01:00
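The standard Ansible idiom for this, sketched here with an invented follow-up task:

    # Run any queued handlers (e.g. systemctl daemon-reload) immediately,
    # instead of waiting for the end of the play.
    - meta: flush_handlers

    - name: Restart etcd against the freshly loaded unit file
      service:
        name: etcd
        state: restarted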
Bogdan Dobrelya
36b62b7270 Merge pull request #896 from bogdando/idempot_check
Add idempotency checks for CI
2017-01-17 14:21:32 +01:00
Matthew Mosesohn
73204c868d Merge pull request #909 from mattymo/docker-upgrade
Always trigger docker restart when docker package changes
2017-01-17 11:37:42 +03:00
Matthew Mosesohn
2ee889843a Merge pull request #900 from galthaus/cn-length
Cert fail if inventory names too long
2017-01-16 23:39:32 +03:00
Matthew Mosesohn
74b78e75a1 Always trigger docker restart when docker package changes
Docker upgrade doesn't auto-restart docker, causing failures
when trying to start another container
2017-01-16 17:52:28 +03:00
Greg Althaus
6905edbeb6 Add a variable that defaults to kube_apiserver_port and defines
which port the local nginx proxy should listen on for HA
local balancer configurations.
2017-01-14 23:38:07 -06:00
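Sketched as group vars, with the variable name assumed rather than taken from the role:

    # hypothetical override; defaults to the apiserver port
    nginx_kube_apiserver_port: "{{ kube_apiserver_port }}"
    # e.g. move the local proxy to 8443 on hosts where the apiserver
    # itself already binds the default port:
    #nginx_kube_apiserver_port: 8443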
Greg Althaus
6c69da1573 This PR adds or modifies a few tasks to allow the playbook to
be run with --limit on each node without regard for order.

The changes make sure that all of the directories needed to do
certificate management are on the master[0] or etcd[0] node regardless
of when the playbook gets run on each node.  This allows for separate
ansible playbook runs in parallel that don't have to be synchronized.
2017-01-14 23:24:34 -06:00
Bogdan Dobrelya
e776dfd800 Add idempotency checks for CI
Signed-off-by: Bogdan Dobrelya <bogdando@mail.ru>
2017-01-13 17:16:03 +01:00
Greg Althaus
95bf380d07 If the inventory name of the host exceeds 63 characters,
the openssl tools will fail to create signing requests because
the CN is too long.  This is mainly a problem when FQDNs are used
in the inventory file.

This will truncate the hostname for the CN field only at the
first dot. This should handle the issue for most cases.
2017-01-13 10:02:23 -06:00
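In Jinja terms, truncating at the first dot is a one-liner; a hedged sketch, not the exact template from the patch:

    # node1.very.long.example.com -> CN=node1
    cn: "{{ inventory_hostname.split('.')[0] }}"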
Bogdan Dobrelya
2a61ad1b57 Merge pull request #895 from mattymo/same_apiserver_certs
Use only one certificate for all apiservers
2017-01-13 13:05:06 +01:00
Matthew Mosesohn
80703010bd Use only one certificate for all apiservers
https://github.com/kubernetes/kubernetes/issues/25063
2017-01-13 14:03:20 +03:00
Bogdan Dobrelya
e88c10670e Merge pull request #891 from galthaus/selinux-order
preinstall fails on AWS CentOS7 image
2017-01-13 11:51:18 +01:00
Bogdan Dobrelya
2a2953c674 Merge pull request #893 from kubernetes-incubator/undo_hostresolvconf
Don't try to delete kargo specific config from dhclient when file does not exist
2017-01-13 11:35:46 +01:00
Alexander Block
1054f37765 Don't try to delete kargo specific config from dhclient when file does not exist
Also remove the check for != "RedHat" when removing the dhclient hook,
as this also had to be done on other distros. Instead, check whether
the dhclienthookfile is defined.
2017-01-13 10:56:10 +01:00
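A hedged sketch of the resulting guard (task name and shape assumed):

    - name: Remove the kargo dhclient hook where this distro defines one
      file:
        path: "{{ dhclienthookfile }}"
        state: absent
      when: dhclienthookfile is defined   # replaces the old ansible_os_family != "RedHat" check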
Greg Althaus
f77257cf79 When running on a CentOS7 image in AWS with selinux on, the tasks
fail because of their order: selinux prevents setting ip-forwarding.

Moving the tasks around addresses two issues. It makes sure that
the correct python tools are in place before adjusting selinux,
and that ip-forwarding is toggled after the selinux adjustments.
2017-01-12 10:12:21 -06:00
Bogdan Dobrelya
f004cc07df Merge pull request #830 from mattymo/k8sperhost
Generate individual certificates for k8s hosts
2017-01-12 12:42:14 +01:00
Bogdan Dobrelya
065a4da72d Merge pull request #886 from kubernetes-incubator/undo_hostresolvconf
Add tasks to undo changes to hosts /etc/resolv.conf and dhclient configs
2017-01-12 12:27:22 +01:00
Bogdan Dobrelya
98c7f2eb13 Merge pull request #887 from bogdando/docs
Clarify major/minor/maintenance releases
2017-01-12 11:55:00 +01:00
Bogdan Dobrelya
d332502d3d Clarify major/minor/maintenance releases 2017-01-12 11:25:04 +01:00
Signed-off-by: Bogdan Dobrelya <bogdando@mail.ru>
2017-01-12 11:25:04 +01:00
Alexander Block
a7bf7867d7 Add tasks to undo changes to hosts /etc/resolv.conf and dhclient configs 2017-01-11 16:56:16 +01:00
Bogdan Dobrelya
c63cda7c21 Merge pull request #883 from bogdando/docs
Docs updates
2017-01-11 15:40:41 +01:00
Bogdan Dobrelya
caab0cdf27 Docs updates
Fix mismatching inventory examples.
Add command examples.
Clarify groups use cases.

Signed-off-by: Bogdan Dobrelya <bogdando@mail.ru>
2017-01-11 15:39:35 +01:00
Bogdan Dobrelya
1191876ae8 Merge pull request #882 from bogdando/releases
Clarify release policy
2017-01-11 11:45:47 +01:00
Bogdan Dobrelya
fa51a589ef Clarify release policy
Signed-off-by: Bogdan Dobrelya <bogdando@mail.ru>
2017-01-11 11:18:21 +01:00
Matthew Mosesohn
3f274115b0 Generate individual certificates for k8s hosts 2017-01-11 12:58:07 +03:00
Matthew Mosesohn
3b0918981e Merge pull request #878 from bradbeam/rkt-cni
Adding /opt/cni /etc/cni to rkt run kubelet
2017-01-11 12:22:04 +03:00
Bogdan Dobrelya
a327dfeed7 Merge pull request #881 from bogdando/docs
Fix inventory generator link
2017-01-10 17:09:35 +01:00
Bogdan Dobrelya
d8cef34d6c Merge pull request #872 from mattymo/bug868
Bind nginx localhost proxy to localhost
2017-01-10 17:09:25 +01:00
Bogdan Dobrelya
6fb6947feb Fix inventory generator link
Signed-off-by: Bogdan Dobrelya <bogdando@mail.ru>
2017-01-10 17:02:28 +01:00
Brad Beam
db8173da28 Adding /opt/cni /etc/cni to rkt run kubelet 2017-01-10 08:48:58 -06:00
Bogdan Dobrelya
bcdfb3cfb0 Merge pull request #793 from kubernetes-incubator/fix_dhclientconf_path
Fix wrong path of dhclient on CentOS+Azure
2017-01-10 13:23:55 +01:00
Bogdan Dobrelya
79aeb10431 Merge pull request #858 from bradbeam/calicoctl-canal
Misc updates for canal
2017-01-10 12:24:59 +01:00
Matthew Mosesohn
e22f938ae5 Bind nginx localhost proxy to localhost
This proxy should only be listening for local connections, not 0.0.0.0.

Fixes #868
2017-01-09 17:19:54 +03:00
Brad Beam
cf042b2a4c Create network policy directory for canal 2017-01-05 10:54:27 -06:00
Brad Beam
65c86377fc Adding calicoctl to canal deployment 2017-01-05 10:54:27 -06:00
Alexander Block
8e4e3998dd Fix wrong path of dhclient on CentOS+Azure
This was already fixed in #755 but had to be reverted. This PR should be
more intelligent about deciding which path to use.
2016-12-21 21:51:07 +01:00
312 changed files with 6829 additions and 2243 deletions

.gitignore vendored

@@ -1,6 +1,8 @@
.vagrant
*.retry
inventory/vagrant_ansible_inventory
inventory/group_vars/fake_hosts.yml
inventory/host_vars/
temp
.idea
.tox
@@ -10,4 +12,7 @@ temp
*.pyo
*.tfstate
*.tfstate.backup
**/*.sw[pon]
/ssh-bastion.conf
**/*.sw[pon]
vagrant/


@@ -1,4 +1,5 @@
stages:
- moderator
- unit-tests
- deploy-gce-part1
- deploy-gce-part2
@@ -17,7 +18,7 @@ variables:
# us-west1-a
before_script:
- pip install ansible
- pip install ansible==2.2.1.0
- pip install netaddr
- pip install apache-libcloud==0.20.1
- pip install boto==2.9.0
@@ -46,14 +47,23 @@ before_script:
PRIVATE_KEY: $GCE_PRIVATE_KEY
GS_ACCESS_KEY_ID: $GS_KEY
GS_SECRET_ACCESS_KEY: $GS_SECRET
CLOUD_MACHINE_TYPE: "g1-small"
ANSIBLE_KEEP_REMOTE_FILES: "1"
ANSIBLE_CONFIG: ./tests/ansible.cfg
BOOTSTRAP_OS: none
DOWNLOAD_LOCALHOST: "false"
DOWNLOAD_RUN_ONCE: "false"
IDEMPOT_CHECK: "false"
RESET_CHECK: "false"
UPGRADE_TEST: "false"
RESOLVCONF_MODE: docker_dns
LOG_LEVEL: "-vv"
ETCD_DEPLOYMENT: "docker"
KUBELET_DEPLOYMENT: "docker"
VAULT_DEPLOYMENT: "docker"
WEAVE_CPU_LIMIT: "100m"
MAGIC: "ci check this"
.gce: &gce
<<: *job
<<: *docker_service
@@ -62,65 +72,172 @@ before_script:
paths:
- downloads/
- $HOME/.cache
stage: deploy-gce
before_script:
- docker info
- pip install ansible==2.1.3.0
- pip install ansible==2.2.1.0
- pip install netaddr
- pip install apache-libcloud==0.20.1
- pip install boto==2.9.0
- mkdir -p /.ssh
- cp tests/ansible.cfg .
- mkdir -p $HOME/.ssh
- echo $PRIVATE_KEY | base64 -d > $HOME/.ssh/id_rsa
- echo $GCE_PEM_FILE | base64 -d > $HOME/.ssh/gce
- echo $GCE_CREDENTIALS > $HOME/.ssh/gce.json
- chmod 400 $HOME/.ssh/id_rsa
- ansible-playbook --version
- cp tests/ansible.cfg .
- export PYPATH=$([ $BOOTSTRAP_OS = none ] && echo /usr/bin/python || echo /opt/bin/python)
script:
- pwd
- ls
- echo ${PWD}
- >
ansible-playbook tests/cloud_playbooks/create-gce.yml -i tests/local_inventory/hosts.cfg -c local $LOG_LEVEL
-e mode=${CLUSTER_MODE}
-e test_id=${TEST_ID}
-e kube_network_plugin=${KUBE_NETWORK_PLUGIN}
ansible-playbook tests/cloud_playbooks/create-gce.yml -i tests/local_inventory/hosts.cfg -c local
${LOG_LEVEL}
-e cloud_image=${CLOUD_IMAGE}
-e cloud_region=${CLOUD_REGION}
-e gce_credentials_file=${HOME}/.ssh/gce.json
-e gce_project_id=${GCE_PROJECT_ID}
-e gce_service_account_email=${GCE_ACCOUNT}
-e gce_credentials_file=${HOME}/.ssh/gce.json
-e cloud_image=${CLOUD_IMAGE}
-e cloud_machine_type=${CLOUD_MACHINE_TYPE}
-e inventory_path=${PWD}/inventory/inventory.ini
-e cloud_region=${CLOUD_REGION}
-e kube_network_plugin=${KUBE_NETWORK_PLUGIN}
-e mode=${CLUSTER_MODE}
-e test_id=${TEST_ID}
# Check out latest tag if testing upgrade
# Uncomment when gitlab kargo repo has tags
#- test "${UPGRADE_TEST}" != "false" && git fetch --all && git checkout $(git describe --tags $(git rev-list --tags --max-count=1))
- test "${UPGRADE_TEST}" != "false" && git checkout 031cf565ec3ccd3ebbe80eeef3454c3780e5c598 && pip install ansible==2.2.0
# Create cluster
- >
ansible-playbook -i inventory/inventory.ini -u $SSH_USER -e ansible_ssh_user=$SSH_USER $SSH_ARGS
-b --become-user=root -e cloud_provider=gce $LOG_LEVEL -e kube_network_plugin=${KUBE_NETWORK_PLUGIN}
--private-key=${HOME}/.ssh/id_rsa
-e bootstrap_os=${BOOTSTRAP_OS}
ansible-playbook -i inventory/inventory.ini -b --become-user=root --private-key=${HOME}/.ssh/id_rsa -u $SSH_USER
${SSH_ARGS}
${LOG_LEVEL}
-e ansible_python_interpreter=${PYPATH}
-e download_run_once=true
-e download_localhost=true
-e ansible_ssh_user=${SSH_USER}
-e bootstrap_os=${BOOTSTRAP_OS}
-e cert_management=${CERT_MGMT:-script}
-e cloud_provider=gce
-e deploy_netchecker=true
-e resolvconf_mode=${RESOLVCONF_MODE}
-e local_release_dir=${PWD}/downloads
-e download_localhost=${DOWNLOAD_LOCALHOST}
-e download_run_once=${DOWNLOAD_RUN_ONCE}
-e etcd_deployment_type=${ETCD_DEPLOYMENT}
-e kube_network_plugin=${KUBE_NETWORK_PLUGIN}
-e kubelet_deployment_type=${KUBELET_DEPLOYMENT}
-e local_release_dir=${PWD}/downloads
-e resolvconf_mode=${RESOLVCONF_MODE}
-e vault_deployment_type=${VAULT_DEPLOYMENT}
--limit "all:!fake_hosts"
cluster.yml
# Repeat deployment if testing upgrade
- >
if [ "${UPGRADE_TEST}" != "false" ]; then
test "${UPGRADE_TEST}" == "basic" && PLAYBOOK="cluster.yml";
test "${UPGRADE_TEST}" == "graceful" && PLAYBOOK="upgrade-cluster.yml";
pip install ansible==2.2.1.0;
git checkout "${CI_BUILD_REF}";
ansible-playbook -i inventory/inventory.ini -b --become-user=root --private-key=${HOME}/.ssh/id_rsa -u $SSH_USER
${SSH_ARGS}
${LOG_LEVEL}
-e ansible_python_interpreter=${PYPATH}
-e ansible_ssh_user=${SSH_USER}
-e bootstrap_os=${BOOTSTRAP_OS}
-e cloud_provider=gce
-e deploy_netchecker=true
-e download_localhost=${DOWNLOAD_LOCALHOST}
-e download_run_once=${DOWNLOAD_RUN_ONCE}
-e etcd_deployment_type=${ETCD_DEPLOYMENT}
-e kube_network_plugin=${KUBE_NETWORK_PLUGIN}
-e kubelet_deployment_type=${KUBELET_DEPLOYMENT}
-e local_release_dir=${PWD}/downloads
-e resolvconf_mode=${RESOLVCONF_MODE}
-e weave_cpu_requests=${WEAVE_CPU_LIMIT}
-e weave_cpu_limit=${WEAVE_CPU_LIMIT}
--limit "all:!fake_hosts"
$PLAYBOOK;
fi
# Test Cases
## Test Master API
- ansible-playbook -i inventory/inventory.ini -e ansible_python_interpreter=${PYPATH} -u $SSH_USER -e ansible_ssh_user=$SSH_USER $SSH_ARGS -b --become-user=root tests/testcases/010_check-apiserver.yml $LOG_LEVEL
- ansible-playbook -i inventory/inventory.ini -e ansible_python_interpreter=${PYPATH} -u $SSH_USER -e ansible_ssh_user=$SSH_USER $SSH_ARGS -b --become-user=root --limit "all:!fake_hosts" tests/testcases/010_check-apiserver.yml $LOG_LEVEL
## Ping between 2 pods
- ansible-playbook -i inventory/inventory.ini -e ansible_python_interpreter=${PYPATH} -u $SSH_USER -e ansible_ssh_user=$SSH_USER $SSH_ARGS -b --become-user=root tests/testcases/030_check-network.yml $LOG_LEVEL
- ansible-playbook -i inventory/inventory.ini -e ansible_python_interpreter=${PYPATH} -u $SSH_USER -e ansible_ssh_user=$SSH_USER $SSH_ARGS -b --become-user=root --limit "all:!fake_hosts" tests/testcases/030_check-network.yml $LOG_LEVEL
## Advanced DNS checks
- ansible-playbook -i inventory/inventory.ini -e ansible_python_interpreter=${PYPATH} -u $SSH_USER -e ansible_ssh_user=$SSH_USER $SSH_ARGS -b --become-user=root tests/testcases/040_check-network-adv.yml $LOG_LEVEL
- ansible-playbook -i inventory/inventory.ini -e ansible_python_interpreter=${PYPATH} -u $SSH_USER -e ansible_ssh_user=$SSH_USER $SSH_ARGS -b --become-user=root --limit "all:!fake_hosts" tests/testcases/040_check-network-adv.yml $LOG_LEVEL
## Idempotency checks 1/5 (repeat deployment)
- >
if [ "${IDEMPOT_CHECK}" = "true" ]; then
ansible-playbook -i inventory/inventory.ini -u $SSH_USER -e ansible_ssh_user=$SSH_USER $SSH_ARGS
-b --become-user=root -e cloud_provider=gce $LOG_LEVEL -e kube_network_plugin=${KUBE_NETWORK_PLUGIN}
--private-key=${HOME}/.ssh/id_rsa
-e bootstrap_os=${BOOTSTRAP_OS}
-e ansible_python_interpreter=${PYPATH}
-e download_localhost=${DOWNLOAD_LOCALHOST}
-e download_run_once=${DOWNLOAD_RUN_ONCE}
-e deploy_netchecker=true
-e resolvconf_mode=${RESOLVCONF_MODE}
-e local_release_dir=${PWD}/downloads
-e etcd_deployment_type=${ETCD_DEPLOYMENT}
-e kubelet_deployment_type=${KUBELET_DEPLOYMENT}
--limit "all:!fake_hosts"
cluster.yml;
fi
## Idempotency checks 2/5 (Advanced DNS checks)
- >
if [ "${IDEMPOT_CHECK}" = "true" ]; then
ansible-playbook -i inventory/inventory.ini -e ansible_python_interpreter=${PYPATH}
-u $SSH_USER -e ansible_ssh_user=$SSH_USER $SSH_ARGS -b --become-user=root
--limit "all:!fake_hosts"
tests/testcases/040_check-network-adv.yml $LOG_LEVEL;
fi
## Idempotency checks 3/5 (reset deployment)
- >
if [ "${IDEMPOT_CHECK}" = "true" AND "${RESET_CHECK}" = "true" ]; then
ansible-playbook -i inventory/inventory.ini -u $SSH_USER -e ansible_ssh_user=$SSH_USER $SSH_ARGS
-b --become-user=root -e cloud_provider=gce $LOG_LEVEL -e kube_network_plugin=${KUBE_NETWORK_PLUGIN}
--private-key=${HOME}/.ssh/id_rsa
-e bootstrap_os=${BOOTSTRAP_OS}
-e ansible_python_interpreter=${PYPATH}
-e reset_confirmation=yes
--limit "all:!fake_hosts"
reset.yml;
fi
## Idempotency checks 4/5 (redeploy after reset)
- >
if [ "${IDEMPOT_CHECK}" = "true" AND "${RESET_CHECK}" = "true" ]; then
ansible-playbook -i inventory/inventory.ini -u $SSH_USER -e ansible_ssh_user=$SSH_USER $SSH_ARGS
-b --become-user=root -e cloud_provider=gce $LOG_LEVEL -e kube_network_plugin=${KUBE_NETWORK_PLUGIN}
--private-key=${HOME}/.ssh/id_rsa
-e bootstrap_os=${BOOTSTRAP_OS}
-e ansible_python_interpreter=${PYPATH}
-e download_localhost=${DOWNLOAD_LOCALHOST}
-e download_run_once=${DOWNLOAD_RUN_ONCE}
-e deploy_netchecker=true
-e resolvconf_mode=${RESOLVCONF_MODE}
-e local_release_dir=${PWD}/downloads
-e etcd_deployment_type=${ETCD_DEPLOYMENT}
-e kubelet_deployment_type=${KUBELET_DEPLOYMENT}
--limit "all:!fake_hosts"
cluster.yml;
fi
## Idempotency checks 5/5 (Advanced DNS checks)
- >
if [ "${IDEMPOT_CHECK}" = "true" AND "${RESET_CHECK}" = "true" ]; then
ansible-playbook -i inventory/inventory.ini -e ansible_python_interpreter=${PYPATH}
-u $SSH_USER -e ansible_ssh_user=$SSH_USER $SSH_ARGS -b --become-user=root
--limit "all:!fake_hosts"
tests/testcases/040_check-network-adv.yml $LOG_LEVEL;
fi
after_script:
- >
@@ -139,18 +256,21 @@ before_script:
.coreos_calico_sep_variables: &coreos_calico_sep_variables
# stage: deploy-gce-part1
KUBE_NETWORK_PLUGIN: calico
CLOUD_IMAGE: coreos-stable
CLOUD_IMAGE: coreos-stable-1298-6-0-v20170315
CLOUD_REGION: us-west1-b
CLUSTER_MODE: separated
CLUSTER_MODE: separate
BOOTSTRAP_OS: coreos
RESOLVCONF_MODE: host_resolvconf # This is required as long as the CoreOS stable channel uses docker < 1.12
.debian8_canal_ha_variables: &debian8_canal_ha_variables
.ubuntu_canal_ha_variables: &ubuntu_canal_ha_variables
# stage: deploy-gce-part1
KUBE_NETWORK_PLUGIN: canal
CLOUD_IMAGE: debian-8-kubespray
CLOUD_REGION: us-east1-b
CLOUD_IMAGE: ubuntu-1604-xenial
CLOUD_REGION: europe-west1-b
CLOUD_MACHINE_TYPE: "n1-standard-2"
UPGRADE_TEST: "basic"
CLUSTER_MODE: ha
UPGRADE_TEST: "graceful"
.rhel7_weave_variables: &rhel7_weave_variables
# stage: deploy-gce-part1
@@ -176,10 +296,11 @@ before_script:
.coreos_canal_variables: &coreos_canal_variables
# stage: deploy-gce-part2
KUBE_NETWORK_PLUGIN: canal
CLOUD_IMAGE: coreos-stable
CLOUD_IMAGE: coreos-stable-1298-6-0-v20170315
CLOUD_REGION: us-east1-b
CLUSTER_MODE: default
BOOTSTRAP_OS: coreos
IDEMPOT_CHECK: "true"
RESOLVCONF_MODE: host_resolvconf # This is required as long as the CoreOS stable channel uses docker < 1.12
.rhel7_canal_sep_variables: &rhel7_canal_sep_variables
@@ -187,39 +308,52 @@ before_script:
KUBE_NETWORK_PLUGIN: canal
CLOUD_IMAGE: rhel-7
CLOUD_REGION: us-east1-b
CLUSTER_MODE: separated
CLUSTER_MODE: separate
.ubuntu_weave_sep_variables: &ubuntu_weave_sep_variables
# stage: deploy-gce-special
KUBE_NETWORK_PLUGIN: weave
CLOUD_IMAGE: ubuntu-1604-xenial
CLOUD_REGION: us-central1-b
CLUSTER_MODE: separated
CLUSTER_MODE: separate
IDEMPOT_CHECK: "false"
.centos7_calico_ha_variables: &centos7_calico_ha_variables
# stage: deploy-gce-special
KUBE_NETWORK_PLUGIN: calico
DOWNLOAD_LOCALHOST: "true"
DOWNLOAD_RUN_ONCE: "true"
CLOUD_IMAGE: centos-7
CLOUD_REGION: europe-west1-b
CLUSTER_MODE: ha
CLUSTER_MODE: ha-scale
IDEMPOT_CHECK: "true"
.coreos_alpha_weave_ha_variables: &coreos_alpha_weave_ha_variables
# stage: deploy-gce-special
KUBE_NETWORK_PLUGIN: weave
CLOUD_IMAGE: coreos-alpha
CLOUD_IMAGE: coreos-alpha-1325-0-0-v20170216
CLOUD_REGION: us-west1-a
CLUSTER_MODE: ha
CLUSTER_MODE: ha-scale
BOOTSTRAP_OS: coreos
RESOLVCONF_MODE: host_resolvconf # This is required as long as the CoreOS stable channel uses docker < 1.12
.ubuntu_rkt_sep_variables: &ubuntu_rkt_sep_variables
# stage: deploy-gce-part1
KUBE_NETWORK_PLUGIN: flannel
CLOUD_IMAGE: ubuntu-1604-xenial
CLOUD_REGION: us-central1-b
CLUSTER_MODE: separated
CLUSTER_MODE: separate
ETCD_DEPLOYMENT: rkt
KUBELET_DEPLOYMENT: rkt
.ubuntu_vault_sep_variables: &ubuntu_vault_sep_variables
# stage: deploy-gce-part1
KUBE_NETWORK_PLUGIN: canal
CERT_MGMT: vault
CLOUD_IMAGE: ubuntu-1604-xenial
CLOUD_REGION: us-central1-b
CLUSTER_MODE: separate
# Builds for PRs only (premoderated by unit-tests step) and triggers (auto)
coreos-calico-sep:
stage: deploy-gce-part1
@@ -285,24 +419,24 @@ ubuntu-weave-sep-triggers:
only: ['triggers']
# More builds for PRs/merges (manual) and triggers (auto)
debian8-canal-ha:
ubuntu-canal-ha:
stage: deploy-gce-part1
<<: *job
<<: *gce
variables:
<<: *gce_variables
<<: *debian8_canal_ha_variables
<<: *ubuntu_canal_ha_variables
when: manual
except: ['triggers']
only: ['master', /^pr-.*$/]
debian8-canal-ha-triggers:
ubuntu-canal-ha-triggers:
stage: deploy-gce-part1
<<: *job
<<: *gce
variables:
<<: *gce_variables
<<: *debian8_canal_ha_variables
<<: *ubuntu_canal_ha_variables
when: on_success
only: ['triggers']
@@ -327,7 +461,7 @@ rhel7-weave-triggers:
when: on_success
only: ['triggers']
debian8-calico:
debian8-calico-upgrade:
stage: deploy-gce-part2
<<: *job
<<: *gce
@@ -434,16 +568,35 @@ ubuntu-rkt-sep:
except: ['triggers']
only: ['master', /^pr-.*$/]
# Premoderated with manual actions
syntax-check:
ubuntu-vault-sep:
stage: deploy-gce-part1
<<: *job
stage: unit-tests
<<: *gce
variables:
<<: *gce_variables
<<: *ubuntu_vault_sep_variables
when: manual
except: ['triggers']
only: ['master', /^pr-.*$/]
# Premoderated with manual actions
ci-authorized:
<<: *job
stage: moderator
before_script:
- apt-get -y install jq
script:
- ansible-playbook -i inventory/local-tests.cfg -u root -e ansible_ssh_user=root -b --become-user=root cluster.yml -vvv --syntax-check
- /bin/sh scripts/premoderator.sh
except: ['triggers', 'master']
syntax-check:
<<: *job
stage: unit-tests
script:
- ansible-playbook -i inventory/local-tests.cfg -u root -e ansible_ssh_user=root -b --become-user=root cluster.yml -vvv --syntax-check
- ansible-playbook -i inventory/local-tests.cfg -u root -e ansible_ssh_user=root -b --become-user=root upgrade-cluster.yml -vvv --syntax-check
- ansible-playbook -i inventory/local-tests.cfg -u root -e ansible_ssh_user=root -b --become-user=root reset.yml -vvv --syntax-check
except: ['triggers', 'master']
tox-inventory-builder:
stage: unit-tests


@@ -1,6 +1,6 @@
![Kubernetes Logo](https://s28.postimg.org/lf3q4ocpp/k8s.png)
##Deploy a production ready kubernetes cluster
## Deploy a production ready kubernetes cluster
If you have questions, join us on the [kubernetes slack](https://slack.k8s.io), channel **#kargo**.
@@ -55,7 +55,7 @@ Versions of supported components
[flanneld](https://github.com/coreos/flannel/releases) v0.6.2 <br>
[calicoctl](https://github.com/projectcalico/calico-docker/releases) v0.23.0 <br>
[canal](https://github.com/projectcalico/canal) (given calico/flannel versions) <br>
[weave](http://weave.works/) v1.6.1 <br>
[weave](http://weave.works/) v1.8.2 <br>
[docker](https://www.docker.com/) v1.12.5 <br>
[rkt](https://coreos.com/rkt/docs/latest/) v1.21.0 <br>
@@ -67,16 +67,18 @@ plugins can be deployed for a given single cluster.
Requirements
--------------
* **Ansible v2.2 (or newer) and python-netaddr is installed on the machine
that will run Ansible commands**
* **Jinja 2.8 (or newer) is required to run the Ansible Playbooks**
* The target servers must have **access to the Internet** in order to pull docker images.
* The target servers are configured to allow **IPv4 forwarding**.
* **Your ssh key must be copied** to all the servers part of your inventory.
* The **firewalls are not managed**: you'll need to implement your own rules the way you used to.
In order to avoid any issues during deployment, you should disable your firewall.
* The target servers are configured to allow **IPv4 forwarding**.
* **Copy your ssh keys** to all the servers part of your inventory.
* **Ansible v2.2 (or newer) and python-netaddr**
## Network plugins
You can choose between 4 network plugins. (default: `flannel` with vxlan backend)
You can choose between 4 network plugins. (default: `calico`)
* [**flannel**](docs/flannel.md): gre/vxlan (layer 2) networking.
@@ -91,11 +93,23 @@ The choice is defined with the variable `kube_network_plugin`. There is also an
option to leverage built-in cloud provider networking instead.
See also [Network checker](docs/netcheck.md).
## Community docs and resources
- [kubernetes.io/docs/getting-started-guides/kargo/](https://kubernetes.io/docs/getting-started-guides/kargo/)
- [kargo, monitoring and logging](https://github.com/gregbkr/kubernetes-kargo-logging-monitoring) by @gregbkr
- [Deploy Kubernetes w/ Ansible & Terraform](https://rsmitty.github.io/Terraform-Ansible-Kubernetes/) by @rsmitty
- [Deploy a Kubernetes Cluster with Kargo (video)](https://www.youtube.com/watch?v=N9q51JgbWu8)
## Tools and projects on top of Kargo
- [Digital Rebar](https://github.com/digitalrebar/digitalrebar)
- [Kargo-cli](https://github.com/kubespray/kargo-cli)
- [Fuel-ccp-installer](https://github.com/openstack/fuel-ccp-installer)
- [Terraform Contrib](https://github.com/kubernetes-incubator/kargo/tree/master/contrib/terraform)
## CI Tests
![Gitlab Logo](https://s27.postimg.org/wmtaig1wz/gitlabci.png)
[![Build graphs](https://gitlab.com/kargo-ci/kubernetes-incubator__kargo/badges/master/build.svg)](https://gitlab.com/kargo-ci/kubernetes-incubator__kargo/pipelines) </br>
CI/end-to-end tests sponsored by Google (GCE), and [teuto.net](https://teuto.net/) for OpenStack.
CI/end-to-end tests sponsored by Google (GCE), DigitalOcean, [teuto.net](https://teuto.net/) (openstack).
See the [test matrix](docs/test_cases.md) for details.


@@ -7,3 +7,34 @@ The Kargo Project is released on an as-needed basis. The process is as follows:
3. An OWNER runs `git tag -s $VERSION` and inserts the changelog and pushes the tag with `git push $VERSION`
4. The release issue is closed
5. An announcement email is sent to `kubernetes-dev@googlegroups.com` with the subject `[ANNOUNCE] kargo $VERSION is released`
## Major/minor releases, merge freezes and milestones
* Kargo does not maintain stable branches for releases. Releases are tags, not
branches, and there are no backports. Therefore, there is also no need for
merge freezes.
* Fixes for major releases (vX.x.0) and minor releases (vX.Y.x) are delivered
via maintenance releases (vX.Y.Z) and assigned to the corresponding open
milestone (vX.Y). That milestone remains open for the major/minor release's
support lifetime, which ends once the milestone is closed. Then only the next major
or minor release can be done.
* Kargo major and minor releases are bound to the given ``kube_version`` major/minor
version numbers and other components' arbitrary versions, like etcd or network plugins.
Older or newer versions are not supported and not tested for the given release.
* There are no unstable releases and no APIs, thus Kargo doesn't follow
[semver](http://semver.org/). Every version describes only a stable release.
Breaking changes, if any are introduced by changed defaults or non-contrib ansible roles'
playbooks, shall be described in the release notes. Other breaking changes, if any in
the contributed addons or bound versions of Kubernetes and other components, are
considered out of Kargo scope and are up to the components' teams to deal with and
document.
* Minor releases can change components' versions, but not the major ``kube_version``.
Greater ``kube_version`` requires a new major or minor release. For example, if Kargo v2.0.0
is bound to ``kube_version: 1.4.x``, ``calico_version: 0.22.0``, ``etcd_version: v3.0.6``,
then Kargo v2.1.0 may be bound to only minor changes to ``kube_version``, like v1.5.1
and *any* changes to other components, like etcd v4, or calico 1.2.3.
And Kargo v3.x.x shall be bound to ``kube_version: 2.x.x`` respectively.

Vagrantfile vendored

@@ -17,6 +17,13 @@ $shared_folders = {}
$forwarded_ports = {}
$subnet = "172.17.8"
$box = "bento/ubuntu-16.04"
# The first three nodes are etcd servers
$etcd_instances = $num_instances
# The first two nodes are masters
$kube_master_instances = $num_instances == 1 ? $num_instances : ($num_instances - 1)
# All nodes are kube nodes
$kube_node_instances = $num_instances
$local_release_dir = "/vagrant/temp"
host_vars = {}
@@ -88,12 +95,14 @@ Vagrant.configure("2") do |config|
ip = "#{$subnet}.#{i+100}"
host_vars[vm_name] = {
"ip" => ip,
#"access_ip" => ip,
"flannel_interface" => ip,
"flannel_backend_type" => "host-gw",
"local_release_dir" => "/vagrant/temp",
"download_run_once" => "False"
"ip": ip,
"flannel_interface": ip,
"flannel_backend_type": "host-gw",
"local_release_dir" => $local_release_dir,
"download_run_once": "False",
# Override the default 'calico' with flannel.
# inventory/group_vars/k8s-cluster.yml
"kube_network_plugin": "flannel",
}
config.vm.network :private_network, ip: ip
@@ -112,12 +121,9 @@ Vagrant.configure("2") do |config|
ansible.host_vars = host_vars
#ansible.tags = ['download']
ansible.groups = {
# The first three nodes should be etcd servers
"etcd" => ["#{$instance_name_prefix}-0[1:3]"],
# The first two nodes should be masters
"kube-master" => ["#{$instance_name_prefix}-0[1:2]"],
# all nodes should be kube nodes
"kube-node" => ["#{$instance_name_prefix}-0[1:#{$num_instances}]"],
"etcd" => ["#{$instance_name_prefix}-0[1:#{$etcd_instances}]"],
"kube-master" => ["#{$instance_name_prefix}-0[1:#{$kube_master_instances}]"],
"kube-node" => ["#{$instance_name_prefix}-0[1:#{$kube_node_instances}]"],
"k8s-cluster:children" => ["kube-master", "kube-node"],
}
end


@@ -7,3 +7,6 @@ host_key_checking=False
gathering = smart
fact_caching = jsonfile
fact_caching_connection = /tmp
stdout_callback = skippy
library = ./library
callback_whitelist = profile_tasks


@@ -2,66 +2,91 @@
- hosts: localhost
gather_facts: False
roles:
- bastion-ssh-config
tags: [localhost, bastion]
- { role: kargo-defaults}
- { role: bastion-ssh-config, tags: ["localhost", "bastion"]}
- hosts: k8s-cluster:etcd:calico-rr
any_errors_fatal: true
any_errors_fatal: "{{ any_errors_fatal | default(true) }}"
gather_facts: false
vars:
# Need to disable pipelining for bootstrap-os as some systems have requiretty in sudoers set, which makes pipelining
# fail. bootstrap-os fixes this on these systems, so in later plays it can be enabled.
ansible_ssh_pipelining: false
roles:
- bootstrap-os
tags:
- bootstrap-os
- { role: kargo-defaults}
- { role: bootstrap-os, tags: bootstrap-os}
- hosts: k8s-cluster:etcd:calico-rr
any_errors_fatal: true
any_errors_fatal: "{{ any_errors_fatal | default(true) }}"
vars:
ansible_ssh_pipelining: true
gather_facts: true
- hosts: k8s-cluster:etcd:calico-rr
any_errors_fatal: true
any_errors_fatal: "{{ any_errors_fatal | default(true) }}"
roles:
- { role: kargo-defaults}
- { role: kernel-upgrade, tags: kernel-upgrade, when: kernel_upgrade is defined and kernel_upgrade }
- { role: kubernetes/preinstall, tags: preinstall }
- { role: docker, tags: docker }
- { role: rkt, tags: rkt, when: "'rkt' in [ etcd_deployment_type, kubelet_deployment_type ]" }
- role: rkt
tags: rkt
when: "'rkt' in [etcd_deployment_type, kubelet_deployment_type, vault_deployment_type]"
- hosts: etcd:!k8s-cluster
any_errors_fatal: true
- hosts: etcd:k8s-cluster:vault
any_errors_fatal: "{{ any_errors_fatal | default(true) }}"
roles:
- { role: etcd, tags: etcd }
- { role: kargo-defaults, when: "cert_management == 'vault'" }
- { role: vault, tags: vault, vault_bootstrap: true, when: "cert_management == 'vault'" }
- hosts: etcd
any_errors_fatal: "{{ any_errors_fatal | default(true) }}"
roles:
- { role: kargo-defaults}
- { role: etcd, tags: etcd, etcd_cluster_setup: true }
- hosts: k8s-cluster
any_errors_fatal: true
any_errors_fatal: "{{ any_errors_fatal | default(true) }}"
roles:
- { role: etcd, tags: etcd }
- { role: kargo-defaults}
- { role: etcd, tags: etcd, etcd_cluster_setup: false }
- hosts: etcd:k8s-cluster:vault
any_errors_fatal: "{{ any_errors_fatal | default(true) }}"
roles:
- { role: kargo-defaults}
- { role: vault, tags: vault, when: "cert_management == 'vault'"}
- hosts: k8s-cluster
any_errors_fatal: "{{ any_errors_fatal | default(true) }}"
roles:
- { role: kargo-defaults}
- { role: kubernetes/node, tags: node }
- { role: network_plugin, tags: network }
- hosts: kube-master
any_errors_fatal: true
any_errors_fatal: "{{ any_errors_fatal | default(true) }}"
roles:
- { role: kargo-defaults}
- { role: kubernetes/master, tags: master }
- { role: kubernetes-apps/lib, tags: apps }
- { role: kubernetes-apps/network_plugin, tags: network }
- { role: kubernetes-apps/policy_controller, tags: policy-controller }
- hosts: calico-rr
any_errors_fatal: true
any_errors_fatal: "{{ any_errors_fatal | default(true) }}"
roles:
- { role: kargo-defaults}
- { role: network_plugin/calico/rr, tags: network }
- hosts: k8s-cluster
any_errors_fatal: true
any_errors_fatal: "{{ any_errors_fatal | default(true) }}"
roles:
- { role: kargo-defaults}
- { role: dnsmasq, when: "dns_mode == 'dnsmasq_kubedns'", tags: dnsmasq }
- { role: kubernetes/preinstall, when: "dns_mode != 'none' and resolvconf_mode == 'host_resolvconf'", tags: resolvconf }
- hosts: kube-master[0]
any_errors_fatal: true
any_errors_fatal: "{{ any_errors_fatal | default(true) }}"
roles:
- { role: kubernetes-apps/lib, tags: apps }
- { role: kargo-defaults}
- { role: kubernetes-apps, tags: apps }


@@ -0,0 +1,27 @@
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": ["ec2:*"],
"Resource": ["*"]
},
{
"Effect": "Allow",
"Action": ["elasticloadbalancing:*"],
"Resource": ["*"]
},
{
"Effect": "Allow",
"Action": ["route53:*"],
"Resource": ["*"]
},
{
"Effect": "Allow",
"Action": "s3:*",
"Resource": [
"arn:aws:s3:::kubernetes-*"
]
}
]
}


@@ -0,0 +1,10 @@
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": { "Service": "ec2.amazonaws.com"},
"Action": "sts:AssumeRole"
}
]
}


@@ -0,0 +1,45 @@
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": "s3:*",
"Resource": [
"arn:aws:s3:::kubernetes-*"
]
},
{
"Effect": "Allow",
"Action": "ec2:Describe*",
"Resource": "*"
},
{
"Effect": "Allow",
"Action": "ec2:AttachVolume",
"Resource": "*"
},
{
"Effect": "Allow",
"Action": "ec2:DetachVolume",
"Resource": "*"
},
{
"Effect": "Allow",
"Action": ["route53:*"],
"Resource": ["*"]
},
{
"Effect": "Allow",
"Action": [
"ecr:GetAuthorizationToken",
"ecr:BatchCheckLayerAvailability",
"ecr:GetDownloadUrlForLayer",
"ecr:GetRepositoryPolicy",
"ecr:DescribeRepositories",
"ecr:ListImages",
"ecr:BatchGetImage"
],
"Resource": "*"
}
]
}


@@ -0,0 +1,10 @@
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": { "Service": "ec2.amazonaws.com"},
"Action": "sts:AssumeRole"
}
]
}


@@ -40,7 +40,8 @@ import os
import re
import sys
ROLES = ['kube-master', 'all', 'k8s-cluster:children', 'kube-node', 'etcd']
ROLES = ['all', 'kube-master', 'kube-node', 'etcd', 'k8s-cluster:children',
'calico-rr']
PROTECTED_NAMES = ROLES
AVAILABLE_COMMANDS = ['help', 'print_cfg', 'print_ips', 'load']
_boolean_states = {'1': True, 'yes': True, 'true': True, 'on': True,
@@ -51,10 +52,18 @@ def get_var_as_bool(name, default):
value = os.environ.get(name, '')
return _boolean_states.get(value.lower(), default)
# Configurable as shell vars start
CONFIG_FILE = os.environ.get("CONFIG_FILE", "./inventory.cfg")
# Reconfigures cluster distribution at scale
SCALE_THRESHOLD = int(os.environ.get("SCALE_THRESHOLD", 50))
MASSIVE_SCALE_THRESHOLD = int(os.environ.get("MASSIVE_SCALE_THRESHOLD", 200))
DEBUG = get_var_as_bool("DEBUG", True)
HOST_PREFIX = os.environ.get("HOST_PREFIX", "node")
# Configurable as shell vars end
class KargoInventory(object):
@@ -74,11 +83,16 @@ class KargoInventory(object):
if changed_hosts:
self.hosts = self.build_hostnames(changed_hosts)
self.purge_invalid_hosts(self.hosts.keys(), PROTECTED_NAMES)
self.set_kube_master(list(self.hosts.keys())[:2])
self.set_all(self.hosts)
self.set_k8s_cluster()
self.set_kube_node(self.hosts.keys())
self.set_etcd(list(self.hosts.keys())[:3])
if len(self.hosts) >= SCALE_THRESHOLD:
self.set_kube_master(list(self.hosts.keys())[3:5])
else:
self.set_kube_master(list(self.hosts.keys())[:2])
self.set_kube_node(self.hosts.keys())
if len(self.hosts) >= SCALE_THRESHOLD:
self.set_calico_rr(list(self.hosts.keys())[:3])
else: # Show help if no options
self.show_help()
sys.exit(0)
@@ -205,8 +219,32 @@ class KargoInventory(object):
self.add_host_to_group('k8s-cluster:children', 'kube-node')
self.add_host_to_group('k8s-cluster:children', 'kube-master')
def set_calico_rr(self, hosts):
for host in hosts:
if host in self.config.items('kube-master'):
self.debug("Not adding {0} to calico-rr group because it "
"conflicts with kube-master group".format(host))
continue
if host in self.config.items('kube-node'):
self.debug("Not adding {0} to calico-rr group because it "
"conflicts with kube-node group".format(host))
continue
self.add_host_to_group('calico-rr', host)
def set_kube_node(self, hosts):
for host in hosts:
if len(self.config['all']) >= SCALE_THRESHOLD:
if self.config.has_option('etcd', host):
self.debug("Not adding {0} to kube-node group because of "
"scale deployment and host is in etcd "
"group.".format(host))
continue
if len(self.config['all']) >= MASSIVE_SCALE_THRESHOLD:
if self.config.has_option('kube-master', host):
self.debug("Not adding {0} to kube-node group because of "
"scale deployment and host is in kube-master "
"group.".format(host))
continue
self.add_host_to_group('kube-node', host)
def set_etcd(self, hosts):
@@ -275,7 +313,15 @@ print_ips - Write a space-delimited list of IPs from "all" group
Advanced usage:
Add another host after initial creation: inventory.py 10.10.1.5
Delete a host: inventory.py -10.10.1.3
Delete a host by id: inventory.py -node1'''
Delete a host by id: inventory.py -node1
Configurable env vars:
DEBUG Enable debug printing. Default: True
CONFIG_FILE File to write config to Default: ./inventory.cfg
HOST_PREFIX Host prefix for generated hosts. Default: node
SCALE_THRESHOLD Separate ETCD role if # of nodes >= 50
MASSIVE_SCALE_THRESHOLD Separate K8s master and ETCD if # of nodes >= 200
'''
print(help_text)
def print_config(self):


@@ -1,48 +0,0 @@
---
- src: https://gitlab.com/kubespray-ansibl8s/k8s-common.git
path: roles/apps
scm: git
#- src: https://gitlab.com/kubespray-ansibl8s/k8s-dashboard.git
# path: roles/apps
# scm: git
#
#- src: https://gitlab.com/kubespray-ansibl8s/k8s-kubedns.git
# path: roles/apps
# scm: git
#
#- src: https://gitlab.com/kubespray-ansibl8s/k8s-elasticsearch.git
# path: roles/apps
# scm: git
#
#- src: https://gitlab.com/kubespray-ansibl8s/k8s-redis.git
# path: roles/apps
# scm: git
#
#- src: https://gitlab.com/kubespray-ansibl8s/k8s-memcached.git
# path: roles/apps
# scm: git
#
#- src: https://gitlab.com/kubespray-ansibl8s/k8s-postgres.git
# path: roles/apps
# scm: git
#
#- src: https://gitlab.com/kubespray-ansibl8s/k8s-pgbouncer.git
# path: roles/apps
# scm: git
#
#- src: https://gitlab.com/kubespray-ansibl8s/k8s-heapster.git
# path: roles/apps
# scm: git
#
#- src: https://gitlab.com/kubespray-ansibl8s/k8s-influxdb.git
# path: roles/apps
# scm: git
#
#- src: https://gitlab.com/kubespray-ansibl8s/k8s-kubedash.git
# path: roles/apps
# scm: git
#
#- src: https://gitlab.com/kubespray-ansibl8s/k8s-kube-logstash.git
# path: roles/apps
# scm: git


@@ -210,3 +210,31 @@ class TestInventory(unittest.TestCase):
self.inv.set_etcd([host])
self.assertTrue(host in self.inv.config[group])
def test_scale_scenario_one(self):
num_nodes = 50
hosts = OrderedDict()
for hostid in range(1, num_nodes+1):
hosts["node" + str(hostid)] = ""
self.inv.set_all(hosts)
self.inv.set_etcd(hosts.keys()[0:3])
self.inv.set_kube_master(hosts.keys()[0:2])
self.inv.set_kube_node(hosts.keys())
for h in range(3):
self.assertFalse(hosts.keys()[h] in self.inv.config['kube-node'])
def test_scale_scenario_two(self):
num_nodes = 500
hosts = OrderedDict()
for hostid in range(1, num_nodes+1):
hosts["node" + str(hostid)] = ""
self.inv.set_all(hosts)
self.inv.set_etcd(hosts.keys()[0:3])
self.inv.set_kube_master(hosts.keys()[3:5])
self.inv.set_kube_node(hosts.keys())
for h in range(5):
self.assertFalse(hosts.keys()[h] in self.inv.config['kube-node'])


@@ -11,7 +11,7 @@ deps =
-r{toxinidir}/test-requirements.txt
setenv = VIRTUAL_ENV={envdir}
passenv = http_proxy HTTP_PROXY https_proxy HTTPS_PROXY no_proxy NO_PROXY
commands = py.test -vv #{posargs:./tests}
commands = pytest -vv #{posargs:./tests}
[testenv:pep8]
usedevelop = False


@@ -0,0 +1,11 @@
# Kargo on KVM Virtual Machines hypervisor preparation
A simple playbook to ensure your system has the right settings to enable Kargo
deployment on VMs.
This playbook does not create Virtual Machines, nor does it run Kargo itself.
### User creation
If you want to create a user for running Kargo deployment, you should specify
both `k8s_deployment_user` and `k8s_deployment_user_pkey_path`.


@@ -0,0 +1,3 @@
#k8s_deployment_user: kargo
#k8s_deployment_user_pkey_path: /tmp/ssh_rsa


@@ -0,0 +1,8 @@
---
- hosts: localhost
gather_facts: False
become: yes
vars:
- bootstrap_os: none
roles:
- kvm-setup


@@ -0,0 +1,46 @@
---
- name: Upgrade all packages to the latest version (yum)
yum:
name: '*'
state: latest
when: ansible_os_family == "RedHat"
- name: Install required packages
yum:
name: "{{ item }}"
state: latest
with_items:
- bind-utils
- ntp
when: ansible_os_family == "RedHat"
- name: Install required packages
apt:
upgrade: yes
update_cache: yes
cache_valid_time: 3600
name: "{{ item }}"
state: latest
install_recommends: no
with_items:
- dnsutils
- ntp
when: ansible_os_family == "Debian"
- name: Upgrade all packages to the latest version (apt)
shell: apt-get -o \
Dpkg::Options::=--force-confdef -o \
Dpkg::Options::=--force-confold -q -y \
dist-upgrade
environment:
DEBIAN_FRONTEND: noninteractive
when: ansible_os_family == "Debian"
# Create deployment user if required
- include: user.yml
when: k8s_deployment_user is defined
# Set proper sysctl values
- include: sysctl.yml


@@ -0,0 +1,46 @@
---
- name: Load br_netfilter module
modprobe:
name: br_netfilter
state: present
register: br_netfilter
- name: Add br_netfilter into /etc/modules
lineinfile:
dest: /etc/modules
state: present
line: 'br_netfilter'
when: br_netfilter is defined and ansible_os_family == 'Debian'
- name: Add br_netfilter into /etc/modules-load.d/kargo.conf
copy:
dest: /etc/modules-load.d/kargo.conf
content: |-
### This file is managed by Ansible
br-netfilter
owner: root
group: root
mode: 0644
when: br_netfilter is defined
- name: Enable net.ipv4.ip_forward in sysctl
sysctl:
name: net.ipv4.ip_forward
value: 1
sysctl_file: /etc/sysctl.d/ipv4-ip_forward.conf
state: present
reload: yes
- name: Set bridge-nf-call-{arptables,iptables} to 0
sysctl:
name: "{{ item }}"
state: present
value: 0
sysctl_file: /etc/sysctl.d/bridge-nf-call.conf
reload: yes
with_items:
- net.bridge.bridge-nf-call-arptables
- net.bridge.bridge-nf-call-ip6tables
- net.bridge.bridge-nf-call-iptables
when: br_netfilter is defined


@@ -0,0 +1,46 @@
---
- name: Create user {{ k8s_deployment_user }}
user:
name: "{{ k8s_deployment_user }}"
groups: adm
shell: /bin/bash
- name: Ensure that .ssh exists
file:
path: "/home/{{ k8s_deployment_user }}/.ssh"
state: directory
owner: "{{ k8s_deployment_user }}"
group: "{{ k8s_deployment_user }}"
- name: Configure sudo for deployment user
copy:
content: |
%{{ k8s_deployment_user }} ALL=(ALL) NOPASSWD: ALL
dest: "/etc/sudoers.d/55-k8s-deployment"
owner: root
group: root
mode: 0644
- name: Write private SSH key
copy:
src: "{{ k8s_deployment_user_pkey_path }}"
dest: "/home/{{ k8s_deployment_user }}/.ssh/id_rsa"
mode: 0400
owner: "{{ k8s_deployment_user }}"
group: "{{ k8s_deployment_user }}"
when: k8s_deployment_user_pkey_path is defined
- name: Write public SSH key
shell: "ssh-keygen -y -f /home/{{ k8s_deployment_user }}/.ssh/id_rsa \
> /home/{{ k8s_deployment_user }}/.ssh/authorized_keys"
args:
creates: "/home/{{ k8s_deployment_user }}/.ssh/authorized_keys"
when: k8s_deployment_user_pkey_path is defined
- name: Fix ssh-pub-key permissions
file:
path: "/home/{{ k8s_deployment_user }}/.ssh/authorized_keys"
mode: 0600
owner: "{{ k8s_deployment_user }}"
group: "{{ k8s_deployment_user }}"
when: k8s_deployment_user_pkey_path is defined


@@ -1 +0,0 @@
../../../../../roles/kubernetes-apps/lib


@@ -1,2 +1,2 @@
*.tfstate*
inventory
.terraform


@@ -1,261 +0,0 @@
variable "deploymentName" {
type = "string"
description = "The desired name of your deployment."
}
variable "numControllers"{
type = "string"
description = "Desired # of controllers."
}
variable "numEtcd" {
type = "string"
description = "Desired # of etcd nodes. Should be an odd number."
}
variable "numNodes" {
type = "string"
description = "Desired # of nodes."
}
variable "volSizeController" {
type = "string"
description = "Volume size for the controllers (GB)."
}
variable "volSizeEtcd" {
type = "string"
description = "Volume size for etcd (GB)."
}
variable "volSizeNodes" {
type = "string"
description = "Volume size for nodes (GB)."
}
variable "subnet" {
type = "string"
description = "The subnet in which to put your cluster."
}
variable "securityGroups" {
type = "string"
description = "The sec. groups in which to put your cluster."
}
variable "ami"{
type = "string"
description = "AMI to use for all VMs in cluster."
}
variable "SSHKey" {
type = "string"
description = "SSH key to use for VMs."
}
variable "master_instance_type" {
type = "string"
description = "Size of VM to use for masters."
}
variable "etcd_instance_type" {
type = "string"
description = "Size of VM to use for etcd."
}
variable "node_instance_type" {
type = "string"
description = "Size of VM to use for nodes."
}
variable "terminate_protect" {
type = "string"
default = "false"
}
variable "awsRegion" {
type = "string"
}
provider "aws" {
region = "${var.awsRegion}"
}
variable "iam_prefix" {
type = "string"
description = "Prefix name for IAM profiles"
}
resource "aws_iam_instance_profile" "kubernetes_master_profile" {
name = "${var.iam_prefix}_kubernetes_master_profile"
roles = ["${aws_iam_role.kubernetes_master_role.name}"]
}
resource "aws_iam_role" "kubernetes_master_role" {
name = "${var.iam_prefix}_kubernetes_master_role"
assume_role_policy = <<EOF
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": { "Service": "ec2.amazonaws.com"},
"Action": "sts:AssumeRole"
}
]
}
EOF
}
resource "aws_iam_role_policy" "kubernetes_master_policy" {
name = "${var.iam_prefix}_kubernetes_master_policy"
role = "${aws_iam_role.kubernetes_master_role.id}"
policy = <<EOF
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": ["ec2:*"],
"Resource": ["*"]
},
{
"Effect": "Allow",
"Action": ["elasticloadbalancing:*"],
"Resource": ["*"]
},
{
"Effect": "Allow",
"Action": "s3:*",
"Resource": "*"
}
]
}
EOF
}
resource "aws_iam_instance_profile" "kubernetes_node_profile" {
name = "${var.iam_prefix}_kubernetes_node_profile"
roles = ["${aws_iam_role.kubernetes_node_role.name}"]
}
resource "aws_iam_role" "kubernetes_node_role" {
name = "${var.iam_prefix}_kubernetes_node_role"
assume_role_policy = <<EOF
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Principal": { "Service": "ec2.amazonaws.com"},
"Action": "sts:AssumeRole"
}
]
}
EOF
}
resource "aws_iam_role_policy" "kubernetes_node_policy" {
name = "${var.iam_prefix}_kubernetes_node_policy"
role = "${aws_iam_role.kubernetes_node_role.id}"
policy = <<EOF
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": "s3:*",
"Resource": "*"
},
{
"Effect": "Allow",
"Action": "ec2:Describe*",
"Resource": "*"
},
{
"Effect": "Allow",
"Action": "ec2:AttachVolume",
"Resource": "*"
},
{
"Effect": "Allow",
"Action": "ec2:DetachVolume",
"Resource": "*"
}
]
}
EOF
}
resource "aws_instance" "master" {
count = "${var.numControllers}"
ami = "${var.ami}"
instance_type = "${var.master_instance_type}"
subnet_id = "${var.subnet}"
vpc_security_group_ids = ["${var.securityGroups}"]
key_name = "${var.SSHKey}"
disable_api_termination = "${var.terminate_protect}"
iam_instance_profile = "${aws_iam_instance_profile.kubernetes_master_profile.id}"
root_block_device {
volume_size = "${var.volSizeController}"
}
tags {
Name = "${var.deploymentName}-master-${count.index + 1}"
}
}
resource "aws_instance" "etcd" {
count = "${var.numEtcd}"
ami = "${var.ami}"
instance_type = "${var.etcd_instance_type}"
subnet_id = "${var.subnet}"
vpc_security_group_ids = ["${var.securityGroups}"]
key_name = "${var.SSHKey}"
disable_api_termination = "${var.terminate_protect}"
root_block_device {
volume_size = "${var.volSizeEtcd}"
}
tags {
Name = "${var.deploymentName}-etcd-${count.index + 1}"
}
}
resource "aws_instance" "minion" {
count = "${var.numNodes}"
ami = "${var.ami}"
instance_type = "${var.node_instance_type}"
subnet_id = "${var.subnet}"
vpc_security_group_ids = ["${var.securityGroups}"]
key_name = "${var.SSHKey}"
disable_api_termination = "${var.terminate_protect}"
iam_instance_profile = "${aws_iam_instance_profile.kubernetes_node_profile.id}"
root_block_device {
volume_size = "${var.volSizeNodes}"
}
tags {
Name = "${var.deploymentName}-minion-${count.index + 1}"
}
}
output "kubernetes_master_profile" {
value = "${aws_iam_instance_profile.kubernetes_master_profile.id}"
}
output "kubernetes_node_profile" {
value = "${aws_iam_instance_profile.kubernetes_node_profile.id}"
}
output "master-ip" {
value = "${join(", ", aws_instance.master.*.private_ip)}"
}
output "etcd-ip" {
value = "${join(", ", aws_instance.etcd.*.private_ip)}"
}
output "minion-ip" {
value = "${join(", ", aws_instance.minion.*.private_ip)}"
}

View File

@@ -1,37 +0,0 @@
variable "SSHUser" {
type = "string"
description = "SSH User for VMs."
}
resource "null_resource" "ansible-provision" {
depends_on = ["aws_instance.master","aws_instance.etcd","aws_instance.minion"]
##Create Master Inventory
provisioner "local-exec" {
command = "echo \"[kube-master]\" > inventory"
}
provisioner "local-exec" {
command = "echo \"${join("\n",formatlist("%s ansible_ssh_user=%s", aws_instance.master.*.private_ip, var.SSHUser))}\" >> inventory"
}
##Create ETCD Inventory
provisioner "local-exec" {
command = "echo \"\n[etcd]\" >> inventory"
}
provisioner "local-exec" {
command = "echo \"${join("\n",formatlist("%s ansible_ssh_user=%s", aws_instance.etcd.*.private_ip, var.SSHUser))}\" >> inventory"
}
##Create Nodes Inventory
provisioner "local-exec" {
command = "echo \"\n[kube-node]\" >> inventory"
}
provisioner "local-exec" {
command = "echo \"${join("\n",formatlist("%s ansible_ssh_user=%s", aws_instance.minion.*.private_ip, var.SSHUser))}\" >> inventory"
}
provisioner "local-exec" {
command = "echo \"\n[k8s-cluster:children]\nkube-node\nkube-master\" >> inventory"
}
}

View File

@@ -2,27 +2,34 @@
**Overview:**
- This will create nodes in a VPC inside of AWS
This project will create:
* VPC with Public and Private Subnets in # Availability Zones
* Bastion Hosts and NAT Gateways in the Public Subnet
* A dynamic number of masters, etcd, and worker nodes in the Private Subnet
* evenly distributed over the # of Availability Zones
* AWS ELB in the Public Subnet for accessing the Kubernetes API from the internet
- A dynamic number of masters, etcd, and nodes can be created
- These scripts currently expect private IP connectivity with the nodes that are created. This means that you may need a tunnel to your VPC, or to run these scripts from a VM inside the VPC. We will look into how to work around this later.
**Requirements**
- Terraform 0.8.7 or newer
**How to Use:**
- Export the variables for your Amazon credentials:
- Export the variables for your AWS credentials or edit credentials.tfvars:
```
export AWS_ACCESS_KEY_ID="xxx"
export AWS_SECRET_ACCESS_KEY="yyy"
export aws_access_key="xxx"
export aws_secret_key="yyy"
export aws_ssh_key_name="zzz"
```
- Update contrib/terraform/aws/terraform.tfvars with your data
- Run with `terraform apply`
- Run with `terraform apply -var-file="credentials.tfvars"` or `terraform apply` depending on whether you exported your AWS credentials
- Once the infrastructure is created, you can run the kubespray playbooks and supply contrib/terraform/aws/inventory with the `-i` flag.
- Once the infrastructure is created, you can run the kargo playbooks and supply inventory/hosts with the `-i` flag.
**Future Work:**
**Architecture**
- Update the inventory creation file to be something a little more reasonable. It's just a local-exec from Terraform now; using terraform.py or something similar may make sense in the future.
Pictured is an AWS Infrastructure created with this Terraform project distributed over two Availability Zones.
![AWS Infrastructure with Terraform ](docs/aws_kargo.png)

View File

@@ -0,0 +1,185 @@
terraform {
required_version = ">= 0.8.7"
}
provider "aws" {
access_key = "${var.AWS_ACCESS_KEY_ID}"
secret_key = "${var.AWS_SECRET_ACCESS_KEY}"
region = "${var.AWS_DEFAULT_REGION}"
}
/*
* Calling modules which create the initial AWS VPC / AWS ELB
* and AWS IAM Roles for Kubernetes Deployment
*/
module "aws-vpc" {
source = "modules/vpc"
aws_cluster_name = "${var.aws_cluster_name}"
aws_vpc_cidr_block = "${var.aws_vpc_cidr_block}"
aws_avail_zones="${var.aws_avail_zones}"
aws_cidr_subnets_private="${var.aws_cidr_subnets_private}"
aws_cidr_subnets_public="${var.aws_cidr_subnets_public}"
}
module "aws-elb" {
source = "modules/elb"
aws_cluster_name="${var.aws_cluster_name}"
aws_vpc_id="${module.aws-vpc.aws_vpc_id}"
aws_avail_zones="${var.aws_avail_zones}"
aws_subnet_ids_public="${module.aws-vpc.aws_subnet_ids_public}"
aws_elb_api_port = "${var.aws_elb_api_port}"
k8s_secure_api_port = "${var.k8s_secure_api_port}"
}
module "aws-iam" {
source = "modules/iam"
aws_cluster_name="${var.aws_cluster_name}"
}
/*
* Create Bastion Instances in AWS
*
*/
resource "aws_instance" "bastion-server" {
ami = "${var.aws_bastion_ami}"
instance_type = "${var.aws_bastion_size}"
count = "${length(var.aws_cidr_subnets_public)}"
associate_public_ip_address = true
availability_zone = "${element(var.aws_avail_zones,count.index)}"
subnet_id = "${element(module.aws-vpc.aws_subnet_ids_public,count.index)}"
vpc_security_group_ids = [ "${module.aws-vpc.aws_security_group}" ]
key_name = "${var.AWS_SSH_KEY_NAME}"
tags {
Name = "kubernetes-${var.aws_cluster_name}-bastion-${count.index}"
Cluster = "${var.aws_cluster_name}"
Role = "bastion-${var.aws_cluster_name}-${count.index}"
}
}
/*
* Create K8s Master and worker nodes and etcd instances
*
*/
resource "aws_instance" "k8s-master" {
ami = "${var.aws_cluster_ami}"
instance_type = "${var.aws_kube_master_size}"
count = "${var.aws_kube_master_num}"
availability_zone = "${element(var.aws_avail_zones,count.index)}"
subnet_id = "${element(module.aws-vpc.aws_subnet_ids_private,count.index)}"
vpc_security_group_ids = [ "${module.aws-vpc.aws_security_group}" ]
iam_instance_profile = "${module.aws-iam.kube-master-profile}"
key_name = "${var.AWS_SSH_KEY_NAME}"
tags {
Name = "kubernetes-${var.aws_cluster_name}-master${count.index}"
Cluster = "${var.aws_cluster_name}"
Role = "master"
}
}
resource "aws_elb_attachment" "attach_master_nodes" {
count = "${var.aws_kube_master_num}"
elb = "${module.aws-elb.aws_elb_api_id}"
instance = "${element(aws_instance.k8s-master.*.id,count.index)}"
}
resource "aws_instance" "k8s-etcd" {
ami = "${var.aws_cluster_ami}"
instance_type = "${var.aws_etcd_size}"
count = "${var.aws_etcd_num}"
availability_zone = "${element(var.aws_avail_zones,count.index)}"
subnet_id = "${element(module.aws-vpc.aws_subnet_ids_private,count.index)}"
vpc_security_group_ids = [ "${module.aws-vpc.aws_security_group}" ]
key_name = "${var.AWS_SSH_KEY_NAME}"
tags {
Name = "kubernetes-${var.aws_cluster_name}-etcd${count.index}"
Cluster = "${var.aws_cluster_name}"
Role = "etcd"
}
}
resource "aws_instance" "k8s-worker" {
ami = "${var.aws_cluster_ami}"
instance_type = "${var.aws_kube_worker_size}"
count = "${var.aws_kube_worker_num}"
availability_zone = "${element(var.aws_avail_zones,count.index)}"
subnet_id = "${element(module.aws-vpc.aws_subnet_ids_private,count.index)}"
vpc_security_group_ids = [ "${module.aws-vpc.aws_security_group}" ]
iam_instance_profile = "${module.aws-iam.kube-worker-profile}"
key_name = "${var.AWS_SSH_KEY_NAME}"
tags {
Name = "kubernetes-${var.aws_cluster_name}-worker${count.index}"
Cluster = "${var.aws_cluster_name}"
Role = "worker"
}
}
/*
* Create Kargo Inventory File
*
*/
data "template_file" "inventory" {
template = "${file("${path.module}/templates/inventory.tpl")}"
vars {
public_ip_address_bastion = "${join("\n",formatlist("bastion ansible_ssh_host=%s" , aws_instance.bastion-server.*.public_ip))}"
connection_strings_master = "${join("\n",formatlist("%s ansible_ssh_host=%s",aws_instance.k8s-master.*.tags.Name, aws_instance.k8s-master.*.private_ip))}"
connection_strings_node = "${join("\n", formatlist("%s ansible_ssh_host=%s", aws_instance.k8s-worker.*.tags.Name, aws_instance.k8s-worker.*.private_ip))}"
connection_strings_etcd = "${join("\n",formatlist("%s ansible_ssh_host=%s", aws_instance.k8s-etcd.*.tags.Name, aws_instance.k8s-etcd.*.private_ip))}"
list_master = "${join("\n",aws_instance.k8s-master.*.tags.Name)}"
list_node = "${join("\n",aws_instance.k8s-worker.*.tags.Name)}"
list_etcd = "${join("\n",aws_instance.k8s-etcd.*.tags.Name)}"
elb_api_fqdn = "apiserver_loadbalancer_domain_name=\"${module.aws-elb.aws_elb_api_fqdn}\""
elb_api_port = "loadbalancer_apiserver.port=${var.aws_elb_api_port}"
}
}
resource "null_resource" "inventories" {
provisioner "local-exec" {
command = "echo '${data.template_file.inventory.rendered}' > ../../../inventory/hosts"
}
}

View File

@@ -0,0 +1,8 @@
#AWS Access Key
AWS_ACCESS_KEY_ID = ""
#AWS Secret Key
AWS_SECRET_ACCESS_KEY = ""
#EC2 SSH Key Name
AWS_SSH_KEY_NAME = ""
#AWS Region
AWS_DEFAULT_REGION = "eu-central-1"

Binary file not shown.


View File

@@ -0,0 +1,58 @@
resource "aws_security_group" "aws-elb" {
name = "kubernetes-${var.aws_cluster_name}-securitygroup-elb"
vpc_id = "${var.aws_vpc_id}"
tags {
Name = "kubernetes-${var.aws_cluster_name}-securitygroup-elb"
}
}
resource "aws_security_group_rule" "aws-allow-api-access" {
type = "ingress"
from_port = "${var.aws_elb_api_port}"
to_port = "${var.k8s_secure_api_port}"
protocol = "TCP"
cidr_blocks = ["0.0.0.0/0"]
security_group_id = "${aws_security_group.aws-elb.id}"
}
resource "aws_security_group_rule" "aws-allow-api-egress" {
type = "egress"
from_port = 0
to_port = 65535
protocol = "TCP"
cidr_blocks = ["0.0.0.0/0"]
security_group_id = "${aws_security_group.aws-elb.id}"
}
# Create a new AWS ELB for K8S API
resource "aws_elb" "aws-elb-api" {
name = "kubernetes-elb-${var.aws_cluster_name}"
subnets = ["${var.aws_subnet_ids_public}"]
security_groups = ["${aws_security_group.aws-elb.id}"]
listener {
instance_port = "${var.k8s_secure_api_port}"
instance_protocol = "tcp"
lb_port = "${var.aws_elb_api_port}"
lb_protocol = "tcp"
}
health_check {
healthy_threshold = 2
unhealthy_threshold = 2
timeout = 3
target = "HTTP:8080/"
interval = 30
}
cross_zone_load_balancing = true
idle_timeout = 400
connection_draining = true
connection_draining_timeout = 400
tags {
Name = "kubernetes-${var.aws_cluster_name}-elb-api"
}
}

View File

@@ -0,0 +1,7 @@
output "aws_elb_api_id" {
value = "${aws_elb.aws-elb-api.id}"
}
output "aws_elb_api_fqdn" {
value = "${aws_elb.aws-elb-api.dns_name}"
}

View File

@@ -0,0 +1,28 @@
variable "aws_cluster_name" {
description = "Name of Cluster"
}
variable "aws_vpc_id" {
description = "AWS VPC ID"
}
variable "aws_elb_api_port" {
description = "Port for AWS ELB"
}
variable "k8s_secure_api_port" {
description = "Secure Port of K8S API Server"
}
variable "aws_avail_zones" {
description = "Availability Zones Used"
type = "list"
}
variable "aws_subnet_ids_public" {
description = "IDs of Public Subnets"
type = "list"
}

View File

@@ -0,0 +1,138 @@
#Add AWS Roles for Kubernetes
resource "aws_iam_role" "kube-master" {
name = "kubernetes-${var.aws_cluster_name}-master"
assume_role_policy = <<EOF
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": "sts:AssumeRole",
"Principal": {
"Service": "ec2.amazonaws.com"
}
}
]
}
EOF
}
resource "aws_iam_role" "kube-worker" {
name = "kubernetes-${var.aws_cluster_name}-node"
assume_role_policy = <<EOF
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": "sts:AssumeRole",
"Principal": {
"Service": "ec2.amazonaws.com"
}
}
]
}
EOF
}
#Add AWS Policies for Kubernetes
resource "aws_iam_role_policy" "kube-master" {
name = "kubernetes-${var.aws_cluster_name}-master"
role = "${aws_iam_role.kube-master.id}"
policy = <<EOF
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": ["ec2:*"],
"Resource": ["*"]
},
{
"Effect": "Allow",
"Action": ["elasticloadbalancing:*"],
"Resource": ["*"]
},
{
"Effect": "Allow",
"Action": ["route53:*"],
"Resource": ["*"]
},
{
"Effect": "Allow",
"Action": "s3:*",
"Resource": [
"arn:aws:s3:::kubernetes-*"
]
}
]
}
EOF
}
resource "aws_iam_role_policy" "kube-worker" {
name = "kubernetes-${var.aws_cluster_name}-node"
role = "${aws_iam_role.kube-worker.id}"
policy = <<EOF
{
"Version": "2012-10-17",
"Statement": [
{
"Effect": "Allow",
"Action": "s3:*",
"Resource": [
"arn:aws:s3:::kubernetes-*"
]
},
{
"Effect": "Allow",
"Action": "ec2:Describe*",
"Resource": "*"
},
{
"Effect": "Allow",
"Action": "ec2:AttachVolume",
"Resource": "*"
},
{
"Effect": "Allow",
"Action": "ec2:DetachVolume",
"Resource": "*"
},
{
"Effect": "Allow",
"Action": ["route53:*"],
"Resource": ["*"]
},
{
"Effect": "Allow",
"Action": [
"ecr:GetAuthorizationToken",
"ecr:BatchCheckLayerAvailability",
"ecr:GetDownloadUrlForLayer",
"ecr:GetRepositoryPolicy",
"ecr:DescribeRepositories",
"ecr:ListImages",
"ecr:BatchGetImage"
],
"Resource": "*"
}
]
}
EOF
}
#Create AWS Instance Profiles
resource "aws_iam_instance_profile" "kube-master" {
name = "kube_${var.aws_cluster_name}_master_profile"
roles = ["${aws_iam_role.kube-master.name}"]
}
resource "aws_iam_instance_profile" "kube-worker" {
name = "kube_${var.aws_cluster_name}_node_profile"
roles = ["${aws_iam_role.kube-worker.name}"]
}

View File

@@ -0,0 +1,7 @@
output "kube-master-profile" {
value = "${aws_iam_instance_profile.kube-master.name }"
}
output "kube-worker-profile" {
value = "${aws_iam_instance_profile.kube-worker.name }"
}

View File

@@ -0,0 +1,3 @@
variable "aws_cluster_name" {
description = "Name of Cluster"
}

View File

@@ -0,0 +1,138 @@
resource "aws_vpc" "cluster-vpc" {
cidr_block = "${var.aws_vpc_cidr_block}"
#DNS Related Entries
enable_dns_support = true
enable_dns_hostnames = true
tags {
Name = "kubernetes-${var.aws_cluster_name}-vpc"
}
}
resource "aws_eip" "cluster-nat-eip" {
count = "${length(var.aws_cidr_subnets_public)}"
vpc = true
}
resource "aws_internet_gateway" "cluster-vpc-internetgw" {
vpc_id = "${aws_vpc.cluster-vpc.id}"
tags {
Name = "kubernetes-${var.aws_cluster_name}-internetgw"
}
}
resource "aws_subnet" "cluster-vpc-subnets-public" {
vpc_id = "${aws_vpc.cluster-vpc.id}"
count="${length(var.aws_avail_zones)}"
availability_zone = "${element(var.aws_avail_zones, count.index)}"
cidr_block = "${element(var.aws_cidr_subnets_public, count.index)}"
tags {
Name = "kubernetes-${var.aws_cluster_name}-${element(var.aws_avail_zones, count.index)}-public"
}
}
resource "aws_nat_gateway" "cluster-nat-gateway" {
count = "${length(var.aws_cidr_subnets_public)}"
allocation_id = "${element(aws_eip.cluster-nat-eip.*.id, count.index)}"
subnet_id = "${element(aws_subnet.cluster-vpc-subnets-public.*.id, count.index)}"
}
resource "aws_subnet" "cluster-vpc-subnets-private" {
vpc_id = "${aws_vpc.cluster-vpc.id}"
count="${length(var.aws_avail_zones)}"
availability_zone = "${element(var.aws_avail_zones, count.index)}"
cidr_block = "${element(var.aws_cidr_subnets_private, count.index)}"
tags {
Name = "kubernetes-${var.aws_cluster_name}-${element(var.aws_avail_zones, count.index)}-private"
}
}
#Routing in VPC
#TODO: Do we need two routing tables for each subnet for redundancy or is one enough?
resource "aws_route_table" "kubernetes-public" {
vpc_id = "${aws_vpc.cluster-vpc.id}"
route {
cidr_block = "0.0.0.0/0"
gateway_id = "${aws_internet_gateway.cluster-vpc-internetgw.id}"
}
tags {
Name = "kubernetes-${var.aws_cluster_name}-routetable-public"
}
}
resource "aws_route_table" "kubernetes-private" {
count = "${length(var.aws_cidr_subnets_private)}"
vpc_id = "${aws_vpc.cluster-vpc.id}"
route {
cidr_block = "0.0.0.0/0"
nat_gateway_id = "${element(aws_nat_gateway.cluster-nat-gateway.*.id, count.index)}"
}
tags {
Name = "kubernetes-${var.aws_cluster_name}-routetable-private-${count.index}"
}
}
resource "aws_route_table_association" "kubernetes-public" {
count = "${length(var.aws_cidr_subnets_public)}"
subnet_id = "${element(aws_subnet.cluster-vpc-subnets-public.*.id,count.index)}"
route_table_id = "${aws_route_table.kubernetes-public.id}"
}
resource "aws_route_table_association" "kubernetes-private" {
count = "${length(var.aws_cidr_subnets_private)}"
subnet_id = "${element(aws_subnet.cluster-vpc-subnets-private.*.id,count.index)}"
route_table_id = "${element(aws_route_table.kubernetes-private.*.id,count.index)}"
}
#Kubernetes Security Groups
resource "aws_security_group" "kubernetes" {
name = "kubernetes-${var.aws_cluster_name}-securitygroup"
vpc_id = "${aws_vpc.cluster-vpc.id}"
tags {
Name = "kubernetes-${var.aws_cluster_name}-securitygroup"
}
}
resource "aws_security_group_rule" "allow-all-ingress" {
type = "ingress"
from_port = 0
to_port = 65535
protocol = "-1"
cidr_blocks= ["${var.aws_vpc_cidr_block}"]
security_group_id = "${aws_security_group.kubernetes.id}"
}
resource "aws_security_group_rule" "allow-all-egress" {
type = "egress"
from_port = 0
to_port = 65535
protocol = "-1"
cidr_blocks = ["0.0.0.0/0"]
security_group_id = "${aws_security_group.kubernetes.id}"
}
resource "aws_security_group_rule" "allow-ssh-connections" {
type = "ingress"
from_port = 22
to_port = 22
protocol = "TCP"
cidr_blocks = ["0.0.0.0/0"]
security_group_id = "${aws_security_group.kubernetes.id}"
}

View File

@@ -0,0 +1,16 @@
output "aws_vpc_id" {
value = "${aws_vpc.cluster-vpc.id}"
}
output "aws_subnet_ids_private" {
value = ["${aws_subnet.cluster-vpc-subnets-private.*.id}"]
}
output "aws_subnet_ids_public" {
value = ["${aws_subnet.cluster-vpc-subnets-public.*.id}"]
}
output "aws_security_group" {
value = ["${aws_security_group.kubernetes.*.id}"]
}

View File

@@ -0,0 +1,24 @@
variable "aws_vpc_cidr_block" {
description = "CIDR Blocks for AWS VPC"
}
variable "aws_cluster_name" {
description = "Name of Cluster"
}
variable "aws_avail_zones" {
description = "AWS Availability Zones Used"
type = "list"
}
variable "aws_cidr_subnets_private" {
description = "CIDR Blocks for private subnets in Availability zones"
type = "list"
}
variable "aws_cidr_subnets_public" {
description = "CIDR Blocks for public subnets in Availability zones"
type = "list"
}

View File

@@ -0,0 +1,20 @@
output "bastion_ip" {
value = "${join("\n", aws_instance.bastion-server.*.public_ip)}"
}
output "masters" {
value = "${join("\n", aws_instance.k8s-master.*.private_ip)}"
}
output "workers" {
value = "${join("\n", aws_instance.k8s-worker.*.private_ip)}"
}
output "etcd" {
value = "${join("\n", aws_instance.k8s-etcd.*.private_ip)}"
}
output "aws_elb_api_fqdn" {
value = "${module.aws-elb.aws_elb_api_fqdn}:${var.aws_elb_api_port}"
}

View File

@@ -0,0 +1,27 @@
${connection_strings_master}
${connection_strings_node}
${connection_strings_etcd}
${public_ip_address_bastion}
[kube-master]
${list_master}
[kube-node]
${list_node}
[etcd]
${list_etcd}
[k8s-cluster:children]
kube-node
kube-master
[k8s-cluster:vars]
${elb_api_fqdn}
${elb_api_port}

View File

@@ -1,22 +1,31 @@
deploymentName="test-kube-deploy"
#Global Vars
aws_cluster_name = "devtest"
numControllers="2"
numEtcd="3"
numNodes="2"
#VPC Vars
aws_vpc_cidr_block = "10.250.192.0/18"
aws_cidr_subnets_private = ["10.250.192.0/20","10.250.208.0/20"]
aws_cidr_subnets_public = ["10.250.224.0/20","10.250.240.0/20"]
aws_avail_zones = ["eu-central-1a","eu-central-1b"]
volSizeController="20"
volSizeEtcd="20"
volSizeNodes="20"
#Bastion Host
aws_bastion_ami = "ami-5900cc36"
aws_bastion_size = "t2.small"
awsRegion="us-west-2"
subnet="subnet-xxxxx"
ami="ami-32a85152"
securityGroups="sg-xxxxx"
SSHUser="core"
SSHKey="my-key"
master_instance_type="m3.xlarge"
etcd_instance_type="m3.xlarge"
node_instance_type="m3.xlarge"
#Kubernetes Cluster
terminate_protect="false"
aws_kube_master_num = 3
aws_kube_master_size = "t2.medium"
aws_etcd_num = 3
aws_etcd_size = "t2.medium"
aws_kube_worker_num = 4
aws_kube_worker_size = "t2.medium"
aws_cluster_ami = "ami-903df7ff"
#Settings AWS ELB
aws_elb_api_port = 443
k8s_secure_api_port = 443

View File

@@ -0,0 +1,32 @@
#Global Vars
aws_cluster_name = "devtest"
aws_region = "eu-central-1"
#VPC Vars
aws_vpc_cidr_block = "10.250.192.0/18"
aws_cidr_subnets_private = ["10.250.192.0/20","10.250.208.0/20"]
aws_cidr_subnets_public = ["10.250.224.0/20","10.250.240.0/20"]
aws_avail_zones = ["eu-central-1a","eu-central-1b"]
#Bastion Host
aws_bastion_ami = "ami-5900cc36"
aws_bastion_size = "t2.small"
#Kubernetes Cluster
aws_kube_master_num = 3
aws_kube_master_size = "t2.medium"
aws_etcd_num = 3
aws_etcd_size = "t2.medium"
aws_kube_worker_num = 4
aws_kube_worker_size = "t2.medium"
aws_cluster_ami = "ami-903df7ff"
#Settings AWS ELB
aws_elb_api_port = 443
k8s_secure_api_port = 443

View File

@@ -0,0 +1,97 @@
variable "AWS_ACCESS_KEY_ID" {
description = "AWS Access Key"
}
variable "AWS_SECRET_ACCESS_KEY" {
description = "AWS Secret Key"
}
variable "AWS_SSH_KEY_NAME" {
description = "Name of the SSH keypair to use in AWS."
}
variable "AWS_DEFAULT_REGION" {
description = "AWS Region"
}
//General Cluster Settings
variable "aws_cluster_name" {
description = "Name of AWS Cluster"
}
//AWS VPC Variables
variable "aws_vpc_cidr_block" {
description = "CIDR Block for VPC"
}
variable "aws_avail_zones" {
description = "Availability Zones Used"
type = "list"
}
variable "aws_cidr_subnets_private" {
description = "CIDR Blocks for private subnets in Availability Zones"
type = "list"
}
variable "aws_cidr_subnets_public" {
description = "CIDR Blocks for public subnets in Availability Zones"
type = "list"
}
//AWS EC2 Settings
variable "aws_bastion_ami" {
description = "AMI ID for Bastion Host in chosen AWS Region"
}
variable "aws_bastion_size" {
description = "EC2 Instance Size of Bastion Host"
}
/*
* AWS EC2 Settings
* The number should be divisible by the number of
* AWS Availability Zones used, without a remainder.
*/
variable "aws_kube_master_num" {
description = "Number of Kubernetes Master Nodes"
}
variable "aws_kube_master_size" {
description = "Instance size of Kube Master Nodes"
}
variable "aws_etcd_num" {
description = "Number of etcd Nodes"
}
variable "aws_etcd_size" {
description = "Instance size of etcd Nodes"
}
variable "aws_kube_worker_num" {
description = "Number of Kubernetes Worker Nodes"
}
variable "aws_kube_worker_size" {
description = "Instance size of Kubernetes Worker Nodes"
}
variable "aws_cluster_ami" {
description = "AMI ID for Kubernetes Cluster"
}
/*
* AWS ELB Settings
*
*/
variable "aws_elb_api_port" {
description = "Port for AWS ELB"
}
variable "k8s_secure_api_port" {
description = "Secure Port of K8S API Server"
}

View File

@@ -0,0 +1 @@
../../inventory/group_vars

View File

@@ -30,7 +30,7 @@ requirements.
#### OpenStack
Ensure your OpenStack credentials are loaded in environment variables. This can be done by downloading a credentials .rc file from your OpenStack dashboard and sourcing it:
Ensure your OpenStack **Identity v2** credentials are loaded in environment variables. This can be done by downloading a credentials .rc file from your OpenStack dashboard and sourcing it:
```
$ source ~/.stackrc

View File

@@ -0,0 +1 @@
../../../inventory/group_vars

View File

@@ -1,165 +0,0 @@
# Valid bootstrap options (required): ubuntu, coreos, none
bootstrap_os: none
# Directory where the binaries will be installed
bin_dir: /usr/local/bin
# Where the binaries will be downloaded.
# Note: ensure that you've enough disk space (about 1G)
local_release_dir: "/tmp/releases"
# Random shifts for retrying failed ops like pushing/downloading
retry_stagger: 5
# Uncomment this line for Container Linux by CoreOS only.
# Directory where python binary is installed
# ansible_python_interpreter: "/opt/bin/python"
# This is the group that the cert creation scripts chgrp the
# cert files to. Not really changeable...
kube_cert_group: kube-cert
# Cluster Loglevel configuration
kube_log_level: 2
# Users to create for basic auth in Kubernetes API via HTTP
kube_api_pwd: "changeme"
kube_users:
kube:
pass: "{{kube_api_pwd}}"
role: admin
root:
pass: "changeme"
role: admin
# Kubernetes cluster name, also will be used as DNS domain
cluster_name: cluster.local
# Subdomains of DNS domain to be resolved via /etc/resolv.conf
ndots: 5
# Deploy netchecker app to verify DNS resolve as an HTTP service
deploy_netchecker: false
# For some environments, each node has a publicly accessible
# address and an address it should bind services to. These are
# really inventory level variables, but described here for consistency.
#
# When advertising access, the access_ip will be used, but will defer to
# ip and then the default ansible ip when unspecified.
#
# When binding to restrict access, the ip variable will be used, but will
# defer to the default ansible ip when unspecified.
#
# The ip variable is used for specific address binding, e.g. listen address
# for etcd. This is used to help with environments like Vagrant or multi-nic
# systems where one address should be preferred over another.
# ip: 10.2.2.2
#
# The access_ip variable is used to define how other nodes should access
# the node. This is used in flannel to allow other flannel nodes to see
# this node, for example. The access_ip is really useful in AWS and Google
# environments where the nodes are accessed remotely by the "public" ip,
# but don't know about that address themselves.
# access_ip: 1.1.1.1
# Etcd access modes:
# Enable multiaccess to configure clients to access all of the etcd members directly
# as the "http://hostX:port, http://hostY:port, ..." and ignore the proxy loadbalancers.
# This may be the case if clients support and loadbalance multiple etcd servers natively.
etcd_multiaccess: true
# Assume that no internal loadbalancers for apiservers exist and listen on
# kube_apiserver_port (default 443)
loadbalancer_apiserver_localhost: true
# Choose network plugin (calico, weave or flannel)
kube_network_plugin: flannel
# Kubernetes internal network for services, unused block of space.
kube_service_addresses: 10.233.0.0/18
# internal network. When used, it will assign IP
# addresses from this range to individual pods.
# This network must be unused in your network infrastructure!
kube_pods_subnet: 10.233.64.0/18
# internal network total size (optional). This is the prefix of the
# entire network. Must be unused in your environment.
# kube_network_prefix: 18
# internal network node size allocation (optional). This is the size allocated
# to each node on your network. With these defaults you should have
# room for 4096 nodes with 254 pods per node.
kube_network_node_prefix: 24
# With calico it is possible to distribute routes with border routers of the datacenter.
peer_with_router: false
# Warning : enabling router peering will disable calico's default behavior ('node mesh').
# The subnets of each node will be distributed by the datacenter router
# The port the API Server will be listening on.
kube_apiserver_ip: "{{ kube_service_addresses|ipaddr('net')|ipaddr(1)|ipaddr('address') }}"
kube_apiserver_port: 443 # (https)
kube_apiserver_insecure_port: 8080 # (http)
# Internal DNS configuration.
# Kubernetes can create and maintain its own DNS server to resolve service names
# into appropriate IP addresses. It's highly advisable to run such DNS server,
# as it greatly simplifies configuration of your applications - you can use
# service names instead of magic environment variables.
# Can be dnsmasq_kubedns, kubedns or none
dns_mode: dnsmasq_kubedns
# Can be docker_dns, host_resolvconf or none
resolvconf_mode: docker_dns
## Upstream dns servers used by dnsmasq
#upstream_dns_servers:
# - 8.8.8.8
# - 8.8.4.4
dns_domain: "{{ cluster_name }}"
# Ip address of the kubernetes skydns service
skydns_server: "{{ kube_service_addresses|ipaddr('net')|ipaddr(3)|ipaddr('address') }}"
dns_server: "{{ kube_service_addresses|ipaddr('net')|ipaddr(2)|ipaddr('address') }}"
# There are some changes specific to the cloud providers
# for instance we need to encapsulate packets with some network plugins
# If set the possible values are either 'gce', 'aws', 'azure' or 'openstack'
# When openstack is used make sure to source in the openstack credentials
# like you would do when using nova-client before starting the playbook.
# When azure is used, you need to also set the following variables.
# cloud_provider:
# see docs/azure.md for details on how to get these values
#azure_tenant_id:
#azure_subscription_id:
#azure_aad_client_id:
#azure_aad_client_secret:
#azure_resource_group:
#azure_location:
#azure_subnet_name:
#azure_security_group_name:
#azure_vnet_name:
## Set these proxy values in order to update docker daemon to use proxies
# http_proxy: ""
# https_proxy: ""
# no_proxy: ""
# Path used to store Docker data
docker_daemon_graph: "/var/lib/docker"
## A string of extra options to pass to the docker daemon.
## This string should be exactly as you wish it to appear.
## An obvious use case is allowing insecure-registry access
## to self hosted registries like so:
docker_options: "--insecure-registry={{ kube_service_addresses }} --graph={{ docker_daemon_graph }}"
# K8s image pull policy (imagePullPolicy)
k8s_image_pull_policy: IfNotPresent
# default packages to install within the cluster
kpm_packages: []
# - name: kube-system/grafana

View File

@@ -68,7 +68,7 @@ resource "openstack_compute_instance_v2" "k8s_master" {
floating_ip = "${element(openstack_networking_floatingip_v2.k8s_master.*.address, count.index)}"
metadata = {
ssh_user = "${var.ssh_user}"
kubespray_groups = "etcd,kube-master,kube-node,k8s-cluster"
kubespray_groups = "etcd,kube-master,kube-node,k8s-cluster,vault"
}
}
@@ -87,10 +87,10 @@ resource "openstack_compute_instance_v2" "k8s_master_no_floating_ip" {
"${openstack_compute_secgroup_v2.k8s.name}" ]
metadata = {
ssh_user = "${var.ssh_user}"
kubespray_groups = "etcd,kube-master,kube-node,k8s-cluster"
kubespray_groups = "etcd,kube-master,kube-node,k8s-cluster,vault,no-floating"
}
provisioner "local-exec" {
command = "sed s/USER/${var.ssh_user}/ contrib/terraform/openstack/ansible_bastion_template.txt | sed s/BASTION_ADDRESS/${element(openstack_networking_floatingip_v2.k8s_master.*.address, 0)}/ > contrib/terraform/openstack/group_vars/k8s-cluster.yml"
command = "sed s/USER/${var.ssh_user}/ contrib/terraform/openstack/ansible_bastion_template.txt | sed s/BASTION_ADDRESS/${element(openstack_networking_floatingip_v2.k8s_master.*.address, 0)}/ > contrib/terraform/openstack/group_vars/no-floating.yml"
}
}
@@ -107,7 +107,7 @@ resource "openstack_compute_instance_v2" "k8s_node" {
floating_ip = "${element(openstack_networking_floatingip_v2.k8s_node.*.address, count.index)}"
metadata = {
ssh_user = "${var.ssh_user}"
kubespray_groups = "kube-node,k8s-cluster"
kubespray_groups = "kube-node,k8s-cluster,vault"
}
}
@@ -123,10 +123,10 @@ resource "openstack_compute_instance_v2" "k8s_node_no_floating_ip" {
security_groups = ["${openstack_compute_secgroup_v2.k8s.name}" ]
metadata = {
ssh_user = "${var.ssh_user}"
kubespray_groups = "kube-node,k8s-cluster"
kubespray_groups = "kube-node,k8s-cluster,vault,no-floating"
}
provisioner "local-exec" {
command = "sed s/USER/${var.ssh_user}/ contrib/terraform/openstack/ansible_bastion_template.txt | sed s/BASTION_ADDRESS/${element(openstack_networking_floatingip_v2.k8s_master.*.address, 0)}/ > contrib/terraform/openstack/group_vars/k8s-cluster.yml"
command = "sed s/USER/${var.ssh_user}/ contrib/terraform/openstack/ansible_bastion_template.txt | sed s/BASTION_ADDRESS/${element(openstack_networking_floatingip_v2.k8s_master.*.address, 0)}/ > contrib/terraform/openstack/group_vars/no-floating.yml"
}
}

View File

@@ -8,20 +8,39 @@ The inventory is composed of 3 groups:
* **kube-node** : list of kubernetes nodes where the pods will run.
* **kube-master** : list of servers where kubernetes master components (apiserver, scheduler, controller) will run.
Note: if you want the server to act both as master and node the server must be defined on both groups _kube-master_ and _kube-node_
* **etcd**: list of server to compose the etcd server. you should have at least 3 servers for failover purposes.
* **etcd**: list of servers to compose the etcd server. You should have at least 3 servers for failover purposes.
Note: do not modify the children of _k8s-cluster_, like putting
the _etcd_ group into _k8s-cluster_, unless you are certain that is
what you want and you have it fully contained in the latter:
```
k8s-cluster ⊂ etcd => kube-node ∩ etcd = etcd
```
When _kube-node_ contains _etcd_, your etcd cluster is also schedulable for Kubernetes workloads.
If you want it standalone, make sure those groups do not intersect.
If you want the server to act both as master and node, the server must be defined
on both groups _kube-master_ and _kube-node_. If you want a standalone and
unschedulable master, the server must be defined only in the _kube-master_ and
not _kube-node_.
There are also two special groups:
* **calico-rr** : explained for [advanced Calico networking cases](docs/calico.md)
* **bastion** : configure a bastion host if your nodes are not directly reachable
Below is a complete inventory example:
```
## Configure 'ip' variable to bind kubernetes services on a
## different ip than the default iface
node1 ansible_ssh_host=95.54.0.12 # ip=10.3.0.1
node2 ansible_ssh_host=95.54.0.13 # ip=10.3.0.2
node3 ansible_ssh_host=95.54.0.14 # ip=10.3.0.3
node4 ansible_ssh_host=95.54.0.15 # ip=10.3.0.4
node5 ansible_ssh_host=95.54.0.16 # ip=10.3.0.5
node6 ansible_ssh_host=95.54.0.17 # ip=10.3.0.6
node1 ansible_ssh_host=95.54.0.12 ip=10.3.0.1
node2 ansible_ssh_host=95.54.0.13 ip=10.3.0.2
node3 ansible_ssh_host=95.54.0.14 ip=10.3.0.3
node4 ansible_ssh_host=95.54.0.15 ip=10.3.0.4
node5 ansible_ssh_host=95.54.0.16 ip=10.3.0.5
node6 ansible_ssh_host=95.54.0.17 ip=10.3.0.6
[kube-master]
node1
@@ -42,14 +61,15 @@ node6
[k8s-cluster:children]
kube-node
kube-master
etcd
```
Group vars and overriding variables precedence
----------------------------------------------
The group variables to control main deployment options are located in the directory ``inventory/group_vars``.
Optional variables are located in the `inventory/group_vars/all.yml`.
Mandatory variables that are common for at least one role (or a node group) can be found in the
`inventory/group_vars/k8s-cluster.yml`.
There are also role vars for docker, rkt, kubernetes preinstall and master roles.
According to the [ansible docs](http://docs.ansible.com/ansible/playbooks_variables.html#variable-precedence-where-should-i-put-a-variable),
those cannot be overridden from the group vars. In order to override, one should use
@@ -142,7 +162,7 @@ ansible-playbook -i inventory/inventory.ini -e dns_server='' cluster.yml --tags
And this prepares all container images locally (at the ansible runner node) without installing
or upgrading related stuff or trying to upload containers to K8s cluster nodes:
```
ansible-playbook -i inventory/inventory.ini cluster.yaml \
ansible-playbook -i inventory/inventory.ini cluster.yml \
-e download_run_once=true -e download_localhost=true \
--tags download --skip-tags upload,upgrade
```

22
docs/atomic.md Normal file
View File

@@ -0,0 +1,22 @@
Atomic host bootstrap
=====================
Atomic host testing has been done with the network plugin flannel. Change the inventory var `kube_network_plugin: flannel`.
Note: Flannel is the only plugin that has currently been tested with Atomic.
### Vagrant
* For bootstrapping with Vagrant, use box centos/atomic-host
* Update the Vagrantfile variable `local_release_dir` to `/var/vagrant/temp`.
* Update `vm_memory = 2048` and `vm_cpus = 2`
* Networking on vagrant hosts has to be brought up manually once they are booted.
```
vagrant ssh
sudo /sbin/ifup enp0s8
```
* For users of vagrant-libvirt, download the qcow2 image from https://wiki.centos.org/SpecialInterestGroup/Atomic/Download/
Then you can proceed to [cluster deployment](#run-deployment)

View File

@@ -3,7 +3,7 @@ AWS
To deploy kubespray on [AWS](https://aws.amazon.com/) uncomment the `cloud_provider` option in `group_vars/all.yml` and set it to `'aws'`.
Prior to creating your instances, you **must** ensure that you have created IAM roles and policies for both "kubernetes-master" and "kubernetes-node". You can find the IAM policies [here](https://github.com/kubernetes/kubernetes/tree/master/cluster/aws/templates/iam). See the [IAM Documentation](https://aws.amazon.com/documentation/iam/) if guidance is needed on how to set these up. When you bring your instances online, associate them with the respective IAM role. Nodes that are only to be used for Etcd do not need a role.
Prior to creating your instances, you **must** ensure that you have created IAM roles and policies for both "kubernetes-master" and "kubernetes-node". You can find the IAM policies [here](https://github.com/kubernetes-incubator/kargo/tree/master/contrib/aws_iam/). See the [IAM Documentation](https://aws.amazon.com/documentation/iam/) if guidance is needed on how to set these up. When you bring your instances online, associate them with the respective IAM role. Nodes that are only to be used for Etcd do not need a role.
The next step is to make sure the hostnames in your `inventory` file are identical to your internal hostnames in AWS. This may look something like `ip-111-222-333-444.us-west-2.compute.internal`. You can then specify how Ansible connects to these instances with `ansible_ssh_host` and `ansible_ssh_user`.
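For reference, enabling the provider boils down to a one-line change in your group vars; a minimal sketch of `inventory/group_vars/all.yml`, with everything else left at its defaults:
```
# inventory/group_vars/all.yml (excerpt)
# Uncommented so the cluster components integrate with AWS APIs
cloud_provider: 'aws'
```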

View File

@@ -1,7 +1,7 @@
Azure
===============
To deploy kubespray on [Azure](https://azure.microsoft.com) uncomment the `cloud_provider` option in `group_vars/all.yml` and set it to `'azure'`.
To deploy Kubernetes on [Azure](https://azure.microsoft.com) uncomment the `cloud_provider` option in `group_vars/all.yml` and set it to `'azure'`.
All your instances are required to run in a resource group and a routing table has to be attached to the subnet your instances are in.
@@ -49,8 +49,8 @@ This is the AppId from the last command
- Create the role assignment with:
`azure role assignment create --spn http://kubernetes -o "Owner" -c /subscriptions/SUBSCRIPTION_ID`
azure\_aad\_client\_id musst be set to the AppId, azure\_aad\_client\_secret is your choosen secret.
azure\_aad\_client\_id must be set to the AppId, azure\_aad\_client\_secret is your chosen secret.
## Provisioning Azure with Resource Group Templates
You'll find Resource Group Templates and scripts to provision the required infrastructore to Azure in [*contrib/azurerm*](../contrib/azurerm/README.md)
You'll find Resource Group Templates and scripts to provision the required infrastructure to Azure in [*contrib/azurerm*](../contrib/azurerm/README.md)
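As a rough sketch, those values end up in `group_vars/all.yml` alongside `cloud_provider`; every value below is a placeholder, not a working credential:
```
# inventory/group_vars/all.yml (excerpt, placeholder values)
cloud_provider: 'azure'
azure_tenant_id: "00000000-0000-0000-0000-000000000000"
azure_subscription_id: "00000000-0000-0000-0000-000000000000"
azure_aad_client_id: "11111111-1111-1111-1111-111111111111"   # the AppId from above
azure_aad_client_secret: "changeme"                           # the secret you chose
azure_resource_group: "my-resource-group"
azure_location: "westeurope"
azure_subnet_name: "my-subnet"
azure_security_group_name: "my-security-group"
azure_vnet_name: "my-vnet"
```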

View File

@@ -147,6 +147,16 @@ The inventory above will deploy the following topology assuming that calico's
![Image](figures/kargo-calico-rr.png?raw=true)
##### Optional : Define default endpoint to host action
By default Calico blocks traffic from endpoints to the host itself by using an iptables DROP action. When using it in kubernetes the action has to be changed to RETURN (default in kargo) or ACCEPT (see https://github.com/projectcalico/felix/issues/660 and https://github.com/projectcalico/calicoctl/issues/1389). Otherwise all network packets from pods (with hostNetwork=False) to service endpoints (with hostNetwork=True) within the same node are dropped.
To re-define default action please set the following variable in your inventory:
```
calico_endpoint_to_host_action: "ACCEPT"
```
Cloud providers configuration
=============================

View File

@@ -13,12 +13,4 @@ Before running the cluster playbook you must satisfy the following requirements:
* On each CoreOS node, a writable directory **/opt/bin** (~400M disk space)
* Uncomment the variable **ansible\_python\_interpreter** in the file `inventory/group_vars/all.yml`
* run the Python bootstrap playbook
```
ansible-playbook -u smana -e ansible_ssh_user=smana -b --become-user=root -i inventory/inventory.cfg coreos-bootstrap.yml
```
Then you can proceed to [cluster deployment](#run-deployment)

View File

@@ -1,10 +1,10 @@
Getting started
===============
The easiest way to run the deployment is to use the **kargo-cli** tool.
Complete documentation can be found in its [github repository](https://github.com/kubespray/kargo-cli).
Here is a simple example on AWS:
* Create instances and generate the inventory
@@ -12,21 +12,46 @@ Here is a simple example on AWS:
kargo aws --instances 3
```
* Run the deployment
```
kargo deploy --aws -u centos -n calico
```
Building your own inventory
^^^^^^^^^^^^^^^^^^^^^^^^^^^
---------------------------
Ansible inventory can be stored in 3 formats: YAML, JSON, or inifile. There is
Ansible inventory can be stored in 3 formats: YAML, JSON, or INI-like. There is
an example inventory located
[here](https://github.com/kubernetes-incubator/kargo/blob/master/inventory/inventory.example).
You can use an
[inventory generator](https://github.com/kubernetes-incubator/kargo/blob/master/contrib/inventory_generator/inventory_generator.py)
[inventory generator](https://github.com/kubernetes-incubator/kargo/blob/master/contrib/inventory_builder/inventory.py)
to create or modify an Ansible inventory. Currently, it is limited in
functionality and is only used for making a basic Kargo cluster, but it does
support creating large clusters.
support creating large clusters. It now supports
separating etcd and Kubernetes master roles from the node role if the size exceeds a
certain threshold. Run `inventory.py help` for more information.
Example inventory generator usage:
```
cp -r inventory my_inventory
declare -a IPS=(10.10.1.3 10.10.1.4 10.10.1.5)
CONFIG_FILE=my_inventory/inventory.cfg python3 contrib/inventory_builder/inventory.py ${IPS}
```
Starting custom deployment
--------------------------
Once you have an inventory, you may want to customize deployment data vars
and start the deployment:
**IMPORTANT: Edit my_inventory/group_vars/*.yaml to override data vars**
```
ansible-playbook -i my_inventory/inventory.cfg cluster.yml -b -v \
--private-key=~/.ssh/private_key
```
See more details in the [ansible guide](ansible.md).
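To illustrate overriding data vars, a hedged example of what such an edit could look like (the values are illustrative, not recommendations):
```
# my_inventory/group_vars/all.yml (example overrides)
bootstrap_os: coreos          # ubuntu, coreos or none
kube_network_plugin: calico   # calico, weave or flannel
deploy_netchecker: true       # deploy the network checker app
```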

View File

@@ -11,18 +11,11 @@ achieve the same goal.
Etcd
----
Etcd proxies are deployed on each node in the `k8s-cluster` group. A proxy is
a separate etcd process. It has a `localhost:2379` frontend and all of the etcd
cluster members as backends. Note that the `access_ip` is used as the backend
IP, if specified. Frontend endpoints cannot be accessed externally as they are
bound to localhost only.
The `etcd_access_endpoint` fact provides an access pattern for clients. And the
`etcd_multiaccess` (defaults to `false`) group var controlls that behavior.
When enabled, it makes deployed components to access the etcd cluster members
`etcd_multiaccess` (defaults to `True`) group var controls that behavior.
It makes deployed components access the etcd cluster members
directly: `http://ip1:2379, http://ip2:2379,...`. This mode assumes the clients
do a loadbalancing and handle HA for connections. Note, a pod definition of a
flannel networking plugin always uses a single `--etcd-server` endpoint!
do their own loadbalancing and handle HA for connections.
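The behavior is controlled from the inventory group vars; a minimal sketch:
```
# inventory/group_vars/all.yml (excerpt)
etcd_multiaccess: true   # clients access all etcd members directly
```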
Kube-apiserver
@@ -33,15 +26,20 @@ proxy. Kargo includes support for an nginx-based proxy that resides on each
non-master Kubernetes node. This is referred to as localhost loadbalancing. It
is less efficient than a dedicated load balancer because it creates extra
health checks on the Kubernetes apiserver, but is more practical for scenarios
where an external LB or virtual IP management is inconvenient.
where an external LB or virtual IP management is inconvenient. This option is
configured by the variable `loadbalancer_apiserver_localhost` (defaults to `True`).
You may also define the port the local internal loadbalancer uses by changing
`nginx_kube_apiserver_port`. This defaults to the value of `kube_apiserver_port`.
It is also important to note that Kargo will only configure kubelet and kube-proxy
on non-master nodes to use the local internal loadbalancer.
This option is configured by the variable `loadbalancer_apiserver_localhost`.
you will need to configure your own loadbalancer to achieve HA. Note that
deploying a loadbalancer is up to a user and is not covered by ansible roles
in Kargo. By default, it only configures a non-HA endpoint, which points to
the `access_ip` or IP address of the first server node in the `kube-master`
group. It can also configure clients to use endpoints for a given loadbalancer
type. The following diagram shows how traffic to the apiserver is directed.
If you choose to NOT use the local internal loadbalancer, you will need to configure
your own loadbalancer to achieve HA. Note that deploying a loadbalancer is up to
a user and is not covered by ansible roles in Kargo. By default, it only configures
a non-HA endpoint, which points to the `access_ip` or IP address of the first server
node in the `kube-master` group. It can also configure clients to use endpoints
for a given loadbalancer type. The following diagram shows how traffic to the
apiserver is directed.
![Image](figures/loadbalancer_localhost.png?raw=true)
@@ -63,8 +61,8 @@ listen kubernetes-apiserver-https
mode tcp
timeout client 3h
timeout server 3h
server master1 <IP1>:443
server master2 <IP2>:443
server master1 <IP1>:6443
server master2 <IP2>:6443
balance roundrobin
```
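To point Kargo's clients at such an external balancer, the inventory usually defines the loadbalancer endpoint vars; a sketch, with a placeholder FQDN:
```
# inventory/group_vars/all.yml (excerpt)
loadbalancer_apiserver_localhost: false
apiserver_loadbalancer_domain_name: "apiserver.example.com"   # placeholder FQDN of your LB
loadbalancer_apiserver:
  port: 443   # the 'lp' port in the table below
```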
@@ -90,16 +88,18 @@ Access endpoints are evaluated automagically, as the following:
| Endpoint type | kube-master | non-master |
|------------------------------|---------------|---------------------|
| Local LB | http://lc:p | https://lc:sp |
| Local LB (default) | http://lc:p | https://lc:nsp |
| External LB, no internal | https://lb:lp | https://lb:lp |
| No ext/int LB (default) | http://lc:p | https://m[0].aip:sp |
| No ext/int LB | http://lc:p | https://m[0].aip:sp |
Where:
* `m[0]` - the first node in the `kube-master` group;
* `lb` - LB FQDN, `apiserver_loadbalancer_domain_name`;
* `lc` - localhost;
* `p` - insecure port, `kube_apiserver_insecure_port`
* `nsp` - nginx secure port, `nginx_kube_apiserver_port`;
* `sp` - secure port, `kube_apiserver_port`;
* `lp` - LB port, `loadbalancer_apiserver.port`, defers to the secure port;
* `ip` - the node IP, defers to the ansible IP;
* `aip` - `access_ip`, defers to the ip.

View File

@@ -0,0 +1,103 @@
# Overview
Distributed systems such as Kubernetes are designed to be resilient to
failures. More details about Kubernetes High-Availability (HA) may be found at
[Building High-Availability Clusters](https://kubernetes.io/docs/admin/high-availability/)
To keep the view simple, most HA details are skipped below in order to describe
only the Kubelet<->Controller Manager communication.
By default the normal behavior looks like:
1. Kubelet updates its status to the apiserver periodically, as specified by
`--node-status-update-frequency`. The default value is **10s**.
2. Kubernetes controller manager checks the statuses of Kubelets every
`--node-monitor-period`. The default value is **5s**.
3. In case the status is updated within `--node-monitor-grace-period` of time,
Kubernetes controller manager considers healthy status of Kubelet. The
default value is **40s**.
> Kubernetes controller manager and Kubelets work asynchronously. It means that
> the delay may include any network latency, API Server latency, etcd latency,
> latency caused by load on one's master nodes and so on. So if
> `--node-status-update-frequency` is set to 5s in reality it may appear in
> etcd in 6-7 seconds or even longer when etcd cannot commit data to quorum
> nodes.
# Failure
Kubelet will try to make `nodeStatusUpdateRetry` post attempts. Currently
`nodeStatusUpdateRetry` is a constant set to 5 in
[kubelet.go](https://github.com/kubernetes/kubernetes/blob/release-1.5/pkg/kubelet/kubelet.go#L102).
Kubelet will try to update the status in
[tryUpdateNodeStatus](https://github.com/kubernetes/kubernetes/blob/release-1.5/pkg/kubelet/kubelet_node_status.go#L345)
function. Kubelet uses Golang's `http.Client()`, but specifies no
timeout. Thus there may be glitches when the API Server is overloaded while
the TCP connection is being established.
So, there will be `nodeStatusUpdateRetry` * `--node-status-update-frequency`
attempts to set the status of a node.
At the same time Kubernetes controller manager will try to check
`nodeStatusUpdateRetry` times every `--node-monitor-period` of time. After
`--node-monitor-grace-period` it will consider the node unhealthy. It will remove
its pods based on `--pod-eviction-timeout`.
Kube proxy has a watcher over the API. Once pods are evicted, Kube proxy will
notice and update the node's iptables. It will remove endpoints from
services, so pods from the failed node won't be accessible anymore.
# Recommendations for different cases
## Fast Update and Fast Reaction
If `--node-status-update-frequency` is set to **4s** (10s is default).
`--node-monitor-period` to **2s** (5s is default).
`--node-monitor-grace-period` to **20s** (40s is default).
`--pod-eviction-timeout` is set to **30s** (5m is default)
In such a scenario, pods will be evicted in **50s** because the node will be
considered down after **20s**, and `--pod-eviction-timeout` occurs after
**30s** more. However, this scenario creates an overhead on etcd as every node
will try to update its status every 2 seconds.
If the environment has 1000 nodes, there will be 15000 node updates per
minute which may require large etcd containers or even dedicated nodes for etcd.
> If we calculate the number of tries, the division will give 5, but in reality
> it will be from 3 to 5 with `nodeStatusUpdateRetry` attempts of each try. The
> total number of attempts will vary from 15 to 25 due to latency of all
> components.
## Medium Update and Average Reaction
Let's set `--node-status-update-frequency` to **20s**
`--node-monitor-grace-period` to **2m** and `--pod-eviction-timeout` to **1m**.
In that case, Kubelet will try to update the status every 20s. So, it will be 6 * 5
= 30 attempts before the Kubernetes controller manager considers the node
unhealthy. After 1m more it will evict all pods. The total time will be 3m
before the eviction process starts.
Such a scenario is good for medium environments as 1000 nodes will require 3000
etcd updates per minute.
> In reality, there will be from 4 to 6 node update tries. The total number
> of attempts will vary from 20 to 30.
## Low Update and Slow reaction
Let's set `--node-status-update-frequency` to **1m**,
`--node-monitor-grace-period` to **5m** and `--pod-eviction-timeout`
to **1m**. In this scenario, every kubelet will try to update the status every
minute. There will be 5 * 5 = 25 attempts before an unhealthy status. After 5m,
the Kubernetes controller manager will set the unhealthy status. Pods
will be evicted 1m after the node is marked unhealthy (6m in total).
> In reality, there will be from 3 to 5 tries. The total number of attempts will
> vary from 15 to 25.
There can be different combinations such as Fast Update with Slow reaction to
satisfy specific cases.
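In Kargo, these flags are usually driven through the inventory variables named in the large-deployments guide; below is a sketch of the "Medium Update and Average Reaction" profile, assuming those vars take the flag values verbatim (where exactly each var lives depends on your inventory layout):
```
# inventory/group_vars/k8s-cluster.yml (excerpt, 'medium' profile sketch)
kubelet_status_update_frequency: 20s
kube_controller_node_monitor_period: 5s
kube_controller_node_monitor_grace_period: 2m
kube_controller_pod_eviction_timeout: 1m
```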

View File

@@ -3,7 +3,8 @@ Large deployments of K8s
For large scale deployments, consider the following configuration changes:
* Tune [ansible settings](http://docs.ansible.com/ansible/intro_configuration.html)
for `forks` and `timeout` vars to fit large numbers of nodes being deployed.
* Override containers' `foo_image_repo` vars to point to intranet registry.
@@ -23,9 +24,25 @@ For a large scaled deployments, consider the following configuration changes:
* Tune CPU/memory limits and requests. Those are located in roles' defaults
and named like ``foo_memory_limit``, ``foo_memory_requests`` and
``foo_cpu_limit``, ``foo_cpu_requests``. Note that 'Mi' memory units for K8s
will be submitted as 'M', if applied for ``docker run``, and cpu K8s units
will end up with the 'm' skipped for docker as well. This is required as
docker does not understand k8s units well.
* Tune ``kubelet_status_update_frequency`` to increase reliability of kubelet.
``kube_controller_node_monitor_grace_period``,
``kube_controller_node_monitor_period``,
``kube_controller_pod_eviction_timeout`` for better Kubernetes reliability.
Check out [Kubernetes Reliability](kubernetes-reliability.md)
* Add calico-rr nodes if you are deploying with Calico or Canal. Nodes recover
from host/network interruption much quicker with calico-rr. Note that
calico-rr role must be on a host without kube-master or kube-node role (but
etcd role is okay).
* Check out the
[Inventory](getting-started.md#building-your-own-inventory)
section of the Getting started guide for tips on creating a large scale
Ansible inventory.
For example, when deploying 200 nodes, you may want to run ansible with
``--forks=50``, ``--timeout=600`` and define the ``retry_stagger: 60``.
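Putting some of this together for that 200-node run: `--forks=50` and `--timeout=600` go on the ansible-playbook command line, while the rest lands in group vars. A sketch follows; `etcd_image_repo` is only an illustrative stand-in for one of the `foo_image_repo` vars, so check the role defaults for the real names:
```
# inventory/group_vars/all.yml (excerpt)
retry_stagger: 60                                 # random shifts for retrying failed ops
etcd_image_repo: "registry.example.local/etcd"    # hypothetical intranet mirror
```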

View File

@@ -2,7 +2,7 @@ Network Checker Application
===========================
With the ``deploy_netchecker`` var enabled (defaults to false), Kargo deploys a
Network Checker Application from the 3rd party `l23network/mcp-netchecker` docker
Network Checker Application from the 3rd party `l23network/k8s-netchecker` docker
images. It consists of a server and agents that try to reach the server through
the usual means of network connectivity available to Kubernetes applications.
Therefore, this automagically verifies pod-to-pod connectivity via the cluster IP and checks
@@ -25,8 +25,8 @@ There are related application specific variables:
netchecker_port: 31081
agent_report_interval: 15
netcheck_namespace: default
agent_img: "quay.io/l23network/mcp-netchecker-agent:v0.1"
server_img: "quay.io/l23network/mcp-netchecker-server:v0.1"
agent_img: "quay.io/l23network/k8s-netchecker-agent:v1.0"
server_img: "quay.io/l23network/k8s-netchecker-server:v1.0"
```
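For instance, enabling the checker is a minimal group_vars change (a sketch
using the variables listed above; values shown are the documented defaults
apart from ``deploy_netchecker``):
```
deploy_netchecker: true   # defaults to false
netcheck_namespace: default
agent_report_interval: 15
```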
Note that the application verifies DNS resolve for FQDNs comprising only the

View File

@@ -5,7 +5,7 @@ Kargo's roadmap
- Propose kubeadm as an option in order to setup the kubernetes cluster.
That would probably improve deployment speed and certs management [#553](https://github.com/kubespray/kargo/issues/553)
### Self deployment (pull-mode) [#320](https://github.com/kubespray/kargo/issues/320)
- the playbook would install and configure docker/rkt and the etcd cluster
- the following data would be inserted into etcd: certs,tokens,users,inventory,group_vars.
- a "kubespray" container would be deployed (kargo-cli, ansible-playbook, kpm)
@@ -13,42 +13,43 @@ That would probably improve deployment speed and certs management [#553](https:/
- **self deployment** of the node from inside a container [#321](https://github.com/kubespray/kargo/issues/321)
### Provisioning and cloud providers
- Terraform to provision instances on **GCE, AWS, Openstack, Digital Ocean, Azure**
- On AWS autoscaling, multi AZ
- On Azure autoscaling, create loadbalancer [#297](https://github.com/kubespray/kargo/issues/297)
- On GCE be able to create a loadbalancer automatically (IAM ?) [#280](https://github.com/kubespray/kargo/issues/280)
- **TLS bootstrap** support for kubelet [#234](https://github.com/kubespray/kargo/issues/234)
(related issues: https://github.com/kubernetes/kubernetes/pull/20439 <br>
- [ ] Terraform to provision instances on **GCE, AWS, Openstack, Digital Ocean, Azure**
- [ ] On AWS autoscaling, multi AZ
- [ ] On Azure autoscaling, create loadbalancer [#297](https://github.com/kubespray/kargo/issues/297)
- [ ] On GCE be able to create a loadbalancer automatically (IAM ?) [#280](https://github.com/kubespray/kargo/issues/280)
- [x] **TLS bootstrap** support for kubelet [#234](https://github.com/kubespray/kargo/issues/234)
(related issues: https://github.com/kubernetes/kubernetes/pull/20439 <br>
https://github.com/kubernetes/kubernetes/issues/18112)
### Tests
- Run kubernetes e2e tests
- migrate to jenkins
- [x] Run kubernetes e2e tests
- [x] migrate to jenkins
(a test is currently a deployment on a 3 node cluster, testing k8s api, ping between 2 pods)
- Full tests on GCE per day (All OS's, all network plugins)
- trigger a single test per pull request
- single test with the Ansible version n-1 per day
- Test idempotency on a single OS but for all network plugins/container engines
- single test on AWS per day
- test different architectures:
- [x] Full tests on GCE per day (All OS's, all network plugins)
- [x] trigger a single test per pull request
- [ ] ~~single test with the Ansible version n-1 per day~~
- [x] Test idempotency on a single OS but for all network plugins/container engines
- [ ] single test on AWS per day
- [x] test different architectures:
- 3 instances, 3 are members of the etcd cluster, 2 of them acting as master and node, 1 as node
- 5 instances, 3 are etcd and nodes, 2 are masters only
- 7 instances, 3 etcd only, 2 masters, 2 nodes
- test scale up cluster: +1 etcd, +1 master, +1 node
- [ ] test scale up cluster: +1 etcd, +1 master, +1 node
### Lifecycle
- Adopt the kubeadm tool by delegating CM tasks it is capable of accomplishing well [#553](https://github.com/kubespray/kargo/issues/553)
- Drain worker node when upgrading k8s components in a worker node. [#154](https://github.com/kubespray/kargo/issues/154)
- Drain worker node when shutting down/deleting an instance
- [ ] Adopt the kubeadm tool by delegating CM tasks it is capable of accomplishing well [#553](https://github.com/kubespray/kargo/issues/553)
- [x] Drain worker node when upgrading k8s components in a worker node. [#154](https://github.com/kubespray/kargo/issues/154)
- [ ] Drain worker node when shutting down/deleting an instance
- [ ] Upgrade granularity: select components to upgrade and skip others
### Networking
- romana.io support [#160](https://github.com/kubespray/kargo/issues/160)
- Configure network policy for Calico. [#159](https://github.com/kubespray/kargo/issues/159)
- Opencontrail
- Canal
- Cloud Provider native networking (instead of our network plugins)
- [ ] romana.io support [#160](https://github.com/kubespray/kargo/issues/160)
- [ ] Configure network policy for Calico. [#159](https://github.com/kubespray/kargo/issues/159)
- [ ] Opencontrail
- [x] Canal
- [x] Cloud Provider native networking (instead of our network plugins)
### High availability
- (to be discussed) option to set a loadbalancer for the apiservers like ucarp/pacemaker/keepalived
While waiting for the issue [kubernetes/kubernetes#18174](https://github.com/kubernetes/kubernetes/issues/18174) to be fixed.
@@ -72,7 +73,7 @@ Include optional deployments to init the cluster:
##### Others
##### Dashboards:
- kubernetes-dashboard
- Fabric8
- Tectonic

View File

@@ -4,25 +4,40 @@ Travis CI test matrix
GCE instances
-------------
Here is the test matrix for the Travis CI gates:
Here is the test matrix for the CI gates:
| Network plugin| OS type| GCE region| Nodes layout|
|-------------------------|-------------------------|-------------------------|-------------------------|
| canal| debian-8-kubespray| asia-east1-a| ha|
| canal| debian-8-kubespray| asia-east1-a| ha-scale|
| calico| debian-8-kubespray| europe-west1-c| default|
| flannel| centos-7| asia-northeast1-c| default|
| calico| centos-7| us-central1-b| ha|
| weave| rhel-7| us-east1-c| default|
| canal| coreos-stable| us-west1-b| default|
| canal| coreos-stable| us-west1-b| ha-scale|
| canal| rhel-7| asia-northeast1-b| separate|
| weave| ubuntu-1604-xenial| europe-west1-d| separate|
| calico| coreos-stable| us-central1-f| separate|
Where the nodes layout `default` is a non-HA two nodes setup with the separate `kube-node`
and the `etcd` group merged with the `kube-master`. The `separate` layout is when
there is only node of each type, which is a kube master, compute and etcd cluster member.
And the `ha` layout stands for a two etcd nodes, two masters and a single worker node,
partially intersecting though.
Node Layouts
------------
There are four node layout types: `default`, `separate`, `ha`, and `scale`.
`default` is a non-HA two nodes setup with one separate `kube-node`
and the `etcd` group merged with the `kube-master`.
`separate` layout is when there is only one node of each type, which includes
a kube-master, kube-node, and etcd cluster member.
`ha` layout consists of two etcd nodes, two masters and a single worker node,
with role intersection.
`scale` layout can be combined with the above layouts. It includes 200 fake hosts
in the Ansible inventory. This helps test TLS certificate generation at scale
to prevent regressions and profile certain long-running tasks. These nodes are
never actually deployed, but certificates are generated for them.
Note that the canal network plugin deploys flannel as well as the calico policy controller.
@@ -40,15 +55,15 @@ GCE instances
| Stage| Network plugin| OS type| GCE region| Nodes layout
|--------------------|--------------------|--------------------|--------------------|--------------------|
| part1| calico| coreos-stable| us-west1-b| separated|
| part1| calico| coreos-stable| us-west1-b| separate|
| part1| canal| debian-8-kubespray| us-east1-b| ha|
| part1| weave| rhel-7| europe-west1-b| default|
| part2| flannel| centos-7| us-west1-a| default|
| part2| calico| debian-8-kubespray| us-central1-b| default|
| part2| canal| coreos-stable| us-east1-b| default|
| special| canal| rhel-7| us-east1-b| separated|
| special| weave| ubuntu-1604-xenial| us-central1-b| separated|
| special| calico| centos-7| europe-west1-b| ha|
| special| weave| coreos-alpha| us-west1-a| ha|
| special| canal| rhel-7| us-east1-b| separate|
| special| weave| ubuntu-1604-xenial| us-central1-b| default|
| special| calico| centos-7| europe-west1-b| ha-scale|
| special| weave| coreos-alpha| us-west1-a| ha-scale|
"Stage" refers to a step of the build pipeline. The steps are ordered as `part1->part2->special`.

View File

@@ -18,7 +18,7 @@ versions. Here are all version vars for each component:
* flannel_version
* kubedns_version
#### Example
#### Unsafe upgrade example
If you wanted to upgrade just kube_version from v1.4.3 to v1.4.6, you could
deploy as follows:
@@ -33,15 +33,29 @@ And then repeat with v1.4.6 as kube_version:
ansible-playbook cluster.yml -i inventory/inventory.cfg -e kube_version=v1.4.6
```
#### Graceful upgrade
Kargo also supports cordoning, draining and uncordoning of nodes when performing
a cluster upgrade. There is a separate playbook used for this purpose. It is
important to note that upgrade-cluster.yml can only be used for upgrading an
existing cluster; that means there must be at least one kube-master already
deployed.
```
git fetch origin
git checkout origin/master
ansible-playbook upgrade-cluster.yml -b -i inventory/inventory.cfg
```
#### Upgrade order
As mentioned above, components are upgraded in the order in which they were
installed in the Ansible playbook. The order of component installation is as
follows:
# Docker
# etcd
# kubelet and kube-proxy
# network_plugin (such as Calico or Weave)
# kube-apiserver, kube-scheduler, and kube-controller-manager
# Add-ons (such as KubeDNS)
* Docker
* etcd
* kubelet and kube-proxy
* network_plugin (such as Calico or Weave)
* kube-apiserver, kube-scheduler, and kube-controller-manager
* Add-ons (such as KubeDNS)

View File

@@ -23,7 +23,7 @@ Some variables of note include:
* *hyperkube_image_repo* - Specify the Docker repository where Hyperkube
resides
* *hyperkube_image_tag* - Specify the Docker tag where Hyperkube resides
* *kube_network_plugin* - Changes k8s plugin to Calico
* *kube_network_plugin* - Sets k8s network plugin (default Calico)
* *kube_proxy_mode* - Changes k8s proxy mode to iptables mode
* *kube_version* - Specify a given Kubernetes hyperkube version
* *searchdomains* - Array of DNS domains to search when looking up hostnames
@@ -41,8 +41,9 @@ Some variables of note include:
address instead of localhost for kube-masters and kube-master[0] for
kube-nodes. See more details in the
[HA guide](https://github.com/kubernetes-incubator/kargo/blob/master/docs/ha-mode.md).
* *loadbalancer_apiserver_localhost* - If enabled, all hosts will connect to
the apiserver internally load balanced endpoint. See more details in the
* *loadbalancer_apiserver_localhost* - makes all hosts connect to the
internally load balanced apiserver endpoint. Mutually exclusive with
`loadbalancer_apiserver`. See more details in the
[HA guide](https://github.com/kubernetes-incubator/kargo/blob/master/docs/ha-mode.md).
#### Cluster variables
@@ -92,9 +93,12 @@ Stack](https://github.com/kubernetes-incubator/kargo/blob/master/docs/dns-stack.
``--insecure-registry=myregistry.mydomain:5000``
* *http_proxy/https_proxy/no_proxy* - Proxy variables for deploying behind a
proxy
* *kubelet_load_modules* - For some setups, kubelet needs to load kernel
modules, e.g. dynamic kernel services needed for mounting persistent volumes
into containers (such as ceph and rbd backed volumes). These may not be loaded
by preinstall kubernetes processes. Set this variable to true to let kubelet
load kernel modules.
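As a hedged sketch, a few of the variables above might be set in an inventory
group_vars file like this (values are illustrative only):
```
kube_network_plugin: calico              # k8s network plugin (default)
kube_version: v1.5.3                     # hyperkube version to deploy
kubelet_load_modules: true               # let kubelet load kernel modules
loadbalancer_apiserver_localhost: true   # use the internal LB endpoint
```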
#### User accounts
Kargo sets up two Kubernetes accounts by default: ``root`` and ``kube``. Their
passwords default to ``changeme``. You can change them by setting ``kube_api_pwd``.

docs/vault.md Normal file
View File

@@ -0,0 +1,92 @@
Hashicorp Vault Role
====================
Overview
--------
The Vault role is a two-step process:
1. Bootstrap
You cannot start your certificate management service securely with SSL (and
the datastore behind it) without having the certificates in-hand already. This
presents an unfortunate chicken and egg scenario, with one requiring the other.
To solve this, the Bootstrap step was added.
This step spins up a temporary instance of Vault to issue certificates for
Vault itself. It then leaves the temporary instance running, so that the Etcd
role can generate certs for itself as well. Eventually, this may be improved
to allow alternate backends (such as Consul), but currently the tasks are
hardcoded to only create a Vault role for Etcd.
2. Cluster
This step is where the long-term Vault cluster is started and configured. Its
first task is to stop any temporary instances of Vault, to free the port for
the long-term cluster. At the end of this task, the entire Vault cluster should
be up and ready to go.
Keys to the Kingdom
-------------------
The two most important security pieces of Vault are the ``root_token``
and ``unsealing_keys``. Both of these values are given exactly once, during
the initialization of the Vault cluster. For convenience, they are saved
to the ``vault_secret_dir`` (default: /etc/vault/secrets) of every host in the
vault group.
It is *highly* recommended that these secrets are removed from the servers after
your cluster has been deployed, and kept in a safe location of your choosing.
Naturally, the seriousness of the situation depends on what you're doing with
your Kargo cluster, but with these secrets, an attacker will have the ability
to authenticate to almost everything in Kubernetes and decode all private
(HTTPS) traffic on your network signed by Vault certificates.
For even greater security, you may want to remove and store elsewhere any
CA keys generated as well (e.g. /etc/vault/ssl/ca-key.pem).
Vault by default encrypts all traffic to and from the datastore backend, all
resting data, and uses TLS for its TCP listener. It is recommended that you
do not change the Vault config to disable TLS, unless you absolutely have to.
Usage
-----
To get the Vault role running, you must do two things at a minimum:
1. Assign the ``vault`` group to at least 1 node in your inventory
2. Change ``cert_management`` to be ``vault`` instead of ``script``
Nothing else is required, but customization is possible. Check
``roles/vault/defaults/main.yml`` for the different variables that can be
overridden, most common being ``vault_config``, ``vault_port``, and
``vault_deployment_type``.
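For example, a minimal sketch of step 2 as a group_vars override
(``cert_management`` and ``vault_deployment_type`` are the documented
variables; this assumes step 1 has already placed at least one host in the
``vault`` inventory group):
```
cert_management: vault
vault_deployment_type: docker   # documented default, shown for illustration
```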
Also, if you intend to use a Root or Intermediate CA generated elsewhere,
you'll need to copy the certificate and key to the hosts in the vault group
prior to running the vault role. By default, they'll be located at
``/etc/vault/ssl/ca.pem`` and ``/etc/vault/ssl/ca-key.pem``, respectively.
Additional Notes:
- ``groups.vault|first`` is considered the source of truth for Vault variables
- ``vault_leader_url`` is used as a pointer to the currently running Vault
- Each service should have its own role and credentials. Currently those
credentials are saved to ``/etc/vault/roles/<role>/``. The service will
need to read in those credentials if it wants to interact with Vault.
Potential Work
--------------
- Change the Vault role to not run certain tasks when ``root_token`` and
``unseal_keys`` are not present. Alternatively, allow user input for these
values when missing.
- Add the ability to start temp Vault with Host, Rkt, or Docker
- Add a dynamic way to change out the backend role creation during Bootstrap,
so other services can be used (such as Consul)
- Segregate Server Cert generation from Auth Cert generation (separate CAs).
This work was partially started with the `auth_cert_backend` tasks, but would
need to be further applied to all roles (particularly Etcd and Kubernetes).

View File

@@ -1,165 +1,59 @@
# Valid bootstrap options (required): ubuntu, coreos, centos, none
bootstrap_os: none
## The access_ip variable is used to define how other nodes should access
## the node. This is used in flannel to allow other flannel nodes to see
## this node for example. The access_ip is really useful in AWS and Google
## environments where the nodes are accessed remotely by the "public" ip,
## but don't know about that address themselves.
#access_ip: 1.1.1.1
# Directory where the binaries will be installed
bin_dir: /usr/local/bin
### LOADBALANCING AND ACCESS MODES
## Enable multiaccess to configure etcd clients to access all of the etcd members directly
## as the "http://hostX:port, http://hostY:port, ..." and ignore the proxy loadbalancers.
## This may be the case if clients support and loadbalance multiple etcd servers natively.
#etcd_multiaccess: true
# Kubernetes configuration dirs and system namespace.
# Those are where all the additional config stuff goes
# the kubernetes normally puts in /srv/kubernetes.
# This puts them in a sane location and namespace.
# Editing those values will almost surely break something.
kube_config_dir: /etc/kubernetes
kube_script_dir: "{{ bin_dir }}/kubernetes-scripts"
kube_manifest_dir: "{{ kube_config_dir }}/manifests"
system_namespace: kube-system
## External LB example config
## apiserver_loadbalancer_domain_name: "elb.some.domain"
#loadbalancer_apiserver:
# address: 1.2.3.4
# port: 1234
# This is where all the cert scripts and certs will be located
kube_cert_dir: "{{ kube_config_dir }}/ssl"
## Internal loadbalancers for apiservers
#loadbalancer_apiserver_localhost: true
# This is where all of the bearer tokens will be stored
kube_token_dir: "{{ kube_config_dir }}/tokens"
## Local loadbalancer should use this port instead, if defined.
## Defaults to kube_apiserver_port (6443)
#nginx_kube_apiserver_port: 8443
# This is where to save basic auth file
kube_users_dir: "{{ kube_config_dir }}/users"
### OTHER OPTIONAL VARIABLES
## For some setups, kubelet needs to load kernel modules, e.g. dynamic kernel services needed
## for mounting persistent volumes into containers (such as ceph and rbd backed volumes).
## These may not be loaded by preinstall kubernetes processes. Set to true to allow kubelet
## to load kernel modules.
# kubelet_load_modules: false
## Change this to use another Kubernetes version, e.g. a current beta release
kube_version: v1.5.1
## Internal network total size. This is the prefix of the
## entire network. Must be unused in your environment.
#kube_network_prefix: 18
# Where the binaries will be downloaded.
# Note: ensure that you've enough disk space (about 1G)
local_release_dir: "/tmp/releases"
# Random shifts for retrying failed ops like pushing/downloading
retry_stagger: 5
# Uncomment this line for CoreOS only.
# Directory where python binary is installed
# ansible_python_interpreter: "/opt/bin/python"
# This is the group that the cert creation scripts chgrp the
# cert files to. Not really changeable...
kube_cert_group: kube-cert
# Cluster Loglevel configuration
kube_log_level: 2
# Kubernetes 1.5 added a new flag to the apiserver to disable anonymous auth. In previous versions, anonymous auth was
# not implemented. As the new flag defaults to true, we have to explicitly disable it. Change this line if you want the
# 1.5 default behavior. The flag is actually only added if the kubernetes version used is >= 1.5
kube_api_anonymous_auth: false
# Users to create for basic auth in Kubernetes API via HTTP
kube_api_pwd: "changeme"
kube_users:
  kube:
    pass: "{{kube_api_pwd}}"
    role: admin
  root:
    pass: "changeme"
    role: admin
# Kubernetes cluster name, also will be used as DNS domain
cluster_name: cluster.local
# Subdomains of DNS domain to be resolved via /etc/resolv.conf for hostnet pods
ndots: 2
# Deploy netchecker app to verify DNS resolve as an HTTP service
deploy_netchecker: false
# For some environments, each node has a publicly accessible
# address and an address it should bind services to. These are
# really inventory level variables, but described here for consistency.
#
# When advertising access, the access_ip will be used, but will defer to
# ip and then the default ansible ip when unspecified.
#
# When binding to restrict access, the ip variable will be used, but will
# defer to the default ansible ip when unspecified.
#
# The ip variable is used for specific address binding, e.g. listen address
# for etcd. This is used to help with environments like Vagrant or multi-nic
# systems where one address should be preferred over another.
# ip: 10.2.2.2
#
# The access_ip variable is used to define how other nodes should access
# the node. This is used in flannel to allow other flannel nodes to see
# this node for example. The access_ip is really useful in AWS and Google
# environments where the nodes are accessed remotely by the "public" ip,
# but don't know about that address themselves.
# access_ip: 1.1.1.1
# Etcd access modes:
# Enable multiaccess to configure clients to access all of the etcd members directly
# as the "http://hostX:port, http://hostY:port, ..." and ignore the proxy loadbalancers.
# This may be the case if clients support and loadbalance multiple etcd servers natively.
etcd_multiaccess: true
# Assume no internal loadbalancers for apiservers exist and listen on
# kube_apiserver_port (default 443)
loadbalancer_apiserver_localhost: true
# Choose network plugin (calico, weave or flannel)
# Can also be set to 'cloud', which lets the cloud provider setup appropriate routing
kube_network_plugin: flannel
# Kubernetes internal network for services, unused block of space.
kube_service_addresses: 10.233.0.0/18
# internal network. When used, it will assign IP
# addresses from this range to individual pods.
# This network must be unused in your network infrastructure!
kube_pods_subnet: 10.233.64.0/18
# internal network total size (optional). This is the prefix of the
# entire network. Must be unused in your environment.
# kube_network_prefix: 18
# internal network node size allocation (optional). This is the size allocated
# to each node on your network. With these defaults you should have
# room for 4096 nodes with 254 pods per node.
kube_network_node_prefix: 24
# With calico it is possible to distribute routes with border routers of the datacenter.
peer_with_router: false
# Warning : enabling router peering will disable calico's default behavior ('node mesh').
# The subnets of each node will be distributed by the datacenter router
# The port the API Server will be listening on.
kube_apiserver_ip: "{{ kube_service_addresses|ipaddr('net')|ipaddr(1)|ipaddr('address') }}"
kube_apiserver_port: 443 # (https)
kube_apiserver_insecure_port: 8080 # (http)
# Internal DNS configuration.
# Kubernetes can create and maintain its own DNS server to resolve service names
# into appropriate IP addresses. It's highly advisable to run such a DNS server,
# as it greatly simplifies configuration of your applications - you can use
# service names instead of magic environment variables.
# Can be dnsmasq_kubedns, kubedns or none
dns_mode: dnsmasq_kubedns
# Can be docker_dns, host_resolvconf or none
resolvconf_mode: docker_dns
## With calico it is possible to distribute routes with border routers of the datacenter.
## Warning : enabling router peering will disable calico's default behavior ('node mesh').
## The subnets of each node will be distributed by the datacenter router
#peer_with_router: false
## Upstream dns servers used by dnsmasq
#upstream_dns_servers:
# - 8.8.8.8
# - 8.8.4.4
dns_domain: "{{ cluster_name }}"
## There are some changes specific to the cloud providers
## for instance we need to encapsulate packets with some network plugins
## If set the possible values are either 'gce', 'aws', 'azure', 'openstack', or 'vsphere'
## When openstack is used make sure to source in the openstack credentials
## like you would do when using nova-client before starting the playbook.
#cloud_provider:
# Ip address of the kubernetes skydns service
skydns_server: "{{ kube_service_addresses|ipaddr('net')|ipaddr(3)|ipaddr('address') }}"
dns_server: "{{ kube_service_addresses|ipaddr('net')|ipaddr(2)|ipaddr('address') }}"
# There are some changes specific to the cloud providers
# for instance we need to encapsulate packets with some network plugins
# If set the possible values are either 'gce', 'aws', 'azure' or 'openstack'
# When openstack is used make sure to source in the openstack credentials
# like you would do when using nova-client before starting the playbook.
# When azure is used, you need to also set the following variables.
# cloud_provider:
# see docs/azure.md for details on how to get these values
## When azure is used, you need to also set the following variables.
## see docs/azure.md for details on how to get these values
#azure_tenant_id:
#azure_subscription_id:
#azure_aad_client_id:
@@ -171,34 +65,24 @@ dns_server: "{{ kube_service_addresses|ipaddr('net')|ipaddr(2)|ipaddr('address')
#azure_vnet_name:
#azure_route_table_name:
## Set these proxy values in order to update docker daemon to use proxies
# http_proxy: ""
# https_proxy: ""
# no_proxy: ""
# Path used to store Docker data
docker_daemon_graph: "/var/lib/docker"
## A string of extra options to pass to the docker daemon.
## This string should be exactly as you wish it to appear.
## An obvious use case is allowing insecure-registry access
## to self hosted registries like so:
docker_options: "--insecure-registry={{ kube_service_addresses }} --graph={{ docker_daemon_graph }}"
docker_bin_dir: "/usr/bin"
#http_proxy: ""
#https_proxy: ""
#no_proxy: ""
## Uncomment this if you want to force overlay/overlay2 as docker storage driver
## Please note that overlay2 is only supported on newer kernels
#docker_storage_options: -s overlay2
# K8s image pull policy (imagePullPolicy)
k8s_image_pull_policy: IfNotPresent
# default packages to install within the cluster
kpm_packages: []
## Default packages to install within the cluster, e.g.:
#kpm_packages:
# - name: kube-system/grafana
# Settings for containerized control plane (etcd/kubelet)
rkt_version: 1.21.0
etcd_deployment_type: docker
kubelet_deployment_type: docker
## Certificate Management
## This setting determines whether certs are generated via scripts or whether a
## cluster of Hashicorp's Vault is started to issue certificates (using etcd
## as a backend). Options are "script" or "vault"
#cert_management: script
## Please specify true if you want to perform a kernel upgrade
kernel_upgrade: false

View File

@@ -0,0 +1,143 @@
# Valid bootstrap options (required): ubuntu, coreos, centos, none
bootstrap_os: none
# Directory where etcd data is stored
etcd_data_dir: /var/lib/etcd
# Directory where the binaries will be installed
bin_dir: /usr/local/bin
# Kubernetes configuration dirs and system namespace.
# Those are where all the additional config stuff goes
# the kubernetes normally puts in /srv/kubernetes.
# This puts them in a sane location and namespace.
# Editing those values will almost surely break something.
kube_config_dir: /etc/kubernetes
kube_script_dir: "{{ bin_dir }}/kubernetes-scripts"
kube_manifest_dir: "{{ kube_config_dir }}/manifests"
system_namespace: kube-system
# Logging directory (sysvinit systems)
kube_log_dir: "/var/log/kubernetes"
# This is where all the cert scripts and certs will be located
kube_cert_dir: "{{ kube_config_dir }}/ssl"
# This is where all of the bearer tokens will be stored
kube_token_dir: "{{ kube_config_dir }}/tokens"
# This is where to save basic auth file
kube_users_dir: "{{ kube_config_dir }}/users"
kube_api_anonymous_auth: false
## Change this to use another Kubernetes version, e.g. a current beta release
kube_version: v1.5.3
# Where the binaries will be downloaded.
# Note: ensure that you've enough disk space (about 1G)
local_release_dir: "/tmp/releases"
# Random shifts for retrying failed ops like pushing/downloading
retry_stagger: 5
# This is the group that the cert creation scripts chgrp the
# cert files to. Not really changeable...
kube_cert_group: kube-cert
# Cluster Loglevel configuration
kube_log_level: 2
# Users to create for basic auth in Kubernetes API via HTTP
kube_api_pwd: "changeme"
kube_users:
  kube:
    pass: "{{kube_api_pwd}}"
    role: admin
  root:
    pass: "{{kube_api_pwd}}"
    role: admin
## It is possible to activate / deactivate selected authentication methods (basic auth, static token auth)
#kube_oidc_auth: false
#kube_basic_auth: false
#kube_token_auth: false
## Variables for OpenID Connect Configuration https://kubernetes.io/docs/admin/authentication/
## To use OpenID you have to additionally deploy an OpenID Provider (e.g. Dex, Keycloak, ...)
# kube_oidc_url: https:// ...
# kube_oidc_client_id: kubernetes
## Optional settings for OIDC
# kube_oidc_ca_file: {{ kube_cert_dir }}/ca.pem
# kube_oidc_username_claim: sub
# kube_oidc_groups_claim: groups
# Choose network plugin (calico, weave or flannel)
# Can also be set to 'cloud', which lets the cloud provider setup appropriate routing
kube_network_plugin: calico
# Enable kubernetes network policies
enable_network_policy: false
# Kubernetes internal network for services, unused block of space.
kube_service_addresses: 10.233.0.0/18
# internal network. When used, it will assign IP
# addresses from this range to individual pods.
# This network must be unused in your network infrastructure!
kube_pods_subnet: 10.233.64.0/18
# internal network node size allocation (optional). This is the size allocated
# to each node on your network. With these defaults you should have
# room for 4096 nodes with 254 pods per node.
kube_network_node_prefix: 24
# The port the API Server will be listening on.
kube_apiserver_ip: "{{ kube_service_addresses|ipaddr('net')|ipaddr(1)|ipaddr('address') }}"
kube_apiserver_port: 6443 # (https)
kube_apiserver_insecure_port: 8080 # (http)
# DNS configuration.
# Kubernetes cluster name, also will be used as DNS domain
cluster_name: cluster.local
# Subdomains of DNS domain to be resolved via /etc/resolv.conf for hostnet pods
ndots: 2
# Can be dnsmasq_kubedns, kubedns or none
dns_mode: dnsmasq_kubedns
# Can be docker_dns, host_resolvconf or none
resolvconf_mode: docker_dns
# Deploy netchecker app to verify DNS resolve as an HTTP service
deploy_netchecker: false
# Ip address of the kubernetes skydns service
skydns_server: "{{ kube_service_addresses|ipaddr('net')|ipaddr(3)|ipaddr('address') }}"
dns_server: "{{ kube_service_addresses|ipaddr('net')|ipaddr(2)|ipaddr('address') }}"
dns_domain: "{{ cluster_name }}"
# Path used to store Docker data
docker_daemon_graph: "/var/lib/docker"
## A string of extra options to pass to the docker daemon.
## This string should be exactly as you wish it to appear.
## An obvious use case is allowing insecure-registry access
## to self hosted registries like so:
docker_options: "--insecure-registry={{ kube_service_addresses }} --graph={{ docker_daemon_graph }}"
docker_bin_dir: "/usr/bin"
# Settings for containerized control plane (etcd/kubelet/secrets)
etcd_deployment_type: docker
kubelet_deployment_type: docker
cert_management: script
vault_deployment_type: docker
# K8s image pull policy (imagePullPolicy)
k8s_image_pull_policy: IfNotPresent
# Monitoring apps for k8s
efk_enabled: false
# Helm deployment
helm_enabled: false

View File

@@ -7,7 +7,7 @@
# node5 ansible_ssh_host=95.54.0.16 # ip=10.3.0.5
# node6 ansible_ssh_host=95.54.0.17 # ip=10.3.0.6
# ## configure a bastion host if your nodes are not publicly reachable
# ## configure a bastion host if your nodes are not directly reachable
# bastion ansible_ssh_host=x.x.x.x
# [kube-master]

View File

@@ -1,2 +1,14 @@
ansible
ansible==2.2.1.0
netaddr
# Ansible 2.2.1 requires jinja2<2.9, see <https://github.com/ansible/ansible/blob/v2.2.1.0-1/setup.py#L25>,
# but without explicitly limiting the upper jinja2 version here, pip ignores
# the Ansible requirements and installs the latest available jinja2
# (pip is not very smart here), which is incompatible with
# Ansible 2.2.1.
# With incompatible jinja2 version "ansible-vault create" (and probably other parts)
# fails with:
# ERROR! Unexpected Exception: The 'jinja2<2.9' distribution was not found
# and is required by ansible
# This upper limit should be removed in the 2.2.2 release, see:
# <https://github.com/ansible/ansible/commit/978311bf3f91dae5806ab72b665b0937adce38ad>
jinja2>=2.8,<2.9

View File

@@ -1,5 +1,18 @@
---
- hosts: all
  vars_prompt:
    name: "reset_confirmation"
    prompt: "Are you sure you want to reset cluster state? Type 'yes' to reset your cluster."
    default: "no"
    private: no
  pre_tasks:
    - name: check confirmation
      fail:
        msg: "Reset confirmation failed"
      when: reset_confirmation != "yes"
  roles:
    - { role: kargo-defaults}
    - { role: reset, tags: reset }

View File

@@ -1,10 +1,13 @@
---
kube_cert_group: kube-cert
etcd_data_dir: "/var/lib/etcd"
addusers:
  etcd:
    name: etcd
    comment: "Etcd user"
    createhome: yes
    home: "/var/lib/etcd"
    home: "{{ etcd_data_dir }}"
    system: yes
    shell: /bin/nologin
  kube:

View File

@@ -1,6 +1,8 @@
---
- name: User | Create User Group
  group: name={{user.group|default(user.name)}} system={{user.system|default(omit)}}
  group:
    name: "{{user.group|default(user.name)}}"
    system: "{{user.system|default(omit)}}"
- name: User | Create User
  user:
View File

@@ -3,7 +3,7 @@ addusers:
  - name: etcd
    comment: "Etcd user"
    createhome: yes
    home: "/var/lib/etcd"
    home: "{{ etcd_data_dir }}"
    system: yes
    shell: /bin/nologin

View File

@@ -3,7 +3,7 @@ addusers:
  - name: etcd
    comment: "Etcd user"
    createhome: yes
    home: "/var/lib/etcd"
    home: "{{ etcd_data_dir }}"
    system: yes
    shell: /bin/nologin

View File

@@ -15,4 +15,6 @@
- name: create ssh bastion conf
become: false
template: src=ssh-bastion.conf dest="{{ playbook_dir }}/ssh-bastion.conf"
template:
src: ssh-bastion.conf
dest: "{{ playbook_dir }}/ssh-bastion.conf"

View File

@@ -4,7 +4,7 @@
{% for h in groups['all'] %}
{% if h != 'bastion' %}
{% if vars.update({'hosts': vars['hosts'] + ' ' + hostvars[h]['ansible_ssh_host']}) %}{% endif %}
{% if vars.update({'hosts': vars['hosts'] + ' ' + (hostvars[h].get('ansible_ssh_host') or hostvars[h]['ansible_host'])}) %}{% endif %}
{% endif %}
{% endfor %}
@@ -18,4 +18,4 @@ Host {{ bastion_ip }}
Host {{ vars['hosts'] }}
ProxyCommand ssh -W %h:%p {{ real_user }}@{{ bastion_ip }}
StrictHostKeyChecking no
{% endif %}
{% endif %}

View File

@@ -1,7 +1,8 @@
---
- name: Check presence of fastestmirror.conf
stat: path=/etc/yum/pluginconf.d/fastestmirror.conf
stat:
path: /etc/yum/pluginconf.d/fastestmirror.conf
register: fastestmirror
# fastestmirror plugin actually slows down Ansible deployments

View File

@@ -18,11 +18,14 @@
register: need_pip
failed_when: false
changed_when: false
check_mode: no
when: (need_bootstrap | failed)
tags: facts
- name: Bootstrap | Copy get-pip.py
copy: src=get-pip.py dest=~/get-pip.py
copy:
src: get-pip.py
dest: ~/get-pip.py
when: (need_pip | failed)
- name: Bootstrap | Install pip
@@ -30,11 +33,16 @@
when: (need_pip | failed)
- name: Bootstrap | Remove get-pip.py
file: path=~/get-pip.py state=absent
file:
path: ~/get-pip.py
state: absent
when: (need_pip | failed)
- name: Bootstrap | Install pip launcher
copy: src=runner dest=/opt/bin/pip mode=0755
copy:
src: runner
dest: /opt/bin/pip
mode: 0755
when: (need_pip | failed)
- name: Install required python modules
@@ -42,10 +50,3 @@
name: "{{ item }}"
with_items: "{{pip_python_modules}}"
- name: Check configured hostname
shell: hostname
register: configured_hostname
- name: Assign inventory name to unconfigured hostnames
shell: sh -c "echo \"{{inventory_hostname}}\" > /etc/hostname; hostname \"{{inventory_hostname}}\""
when: (configured_hostname.stdout == 'localhost')

View File

@@ -2,14 +2,20 @@
# raw: cat /etc/issue.net | grep '{{ bootstrap_versions }}'
- name: Bootstrap | Check if bootstrap is needed
raw: which python
raw: which "{{ item }}"
register: need_bootstrap
failed_when: false
with_items:
- python
- pip
tags: facts
- name: Bootstrap | Install python 2.x
raw: apt-get update && DEBIAN_FRONTEND=noninteractive apt-get install -y python-minimal
when: need_bootstrap | failed
- name: Bootstrap | Install python 2.x and pip
raw:
apt-get update && \
DEBIAN_FRONTEND=noninteractive apt-get install -y python-minimal python-pip
when:
"{{ need_bootstrap.results | map(attribute='rc') | sort | last | bool }}"
- set_fact:
ansible_python_interpreter: "/usr/bin/python"

View File

@@ -8,4 +8,23 @@
- include: bootstrap-centos.yml
when: bootstrap_os == "centos"
- include: setup-pipelining.yml
- include: setup-pipelining.yml
- name: check if atomic host
  stat:
    path: /run/ostree-booted
  register: ostree
- set_fact:
    is_atomic: "{{ ostree.stat.exists }}"
- name: Gather nodes hostnames
  setup:
    gather_subset: '!all'
    filter: ansible_hostname
- name: Assign inventory name to unconfigured hostnames
  hostname:
    name: "{{inventory_hostname}}"
  when: ansible_hostname == 'localhost'

View File

@@ -2,5 +2,8 @@
# Remove requiretty to make ssh pipelining work
- name: Remove require tty
lineinfile: regexp="^\w+\s+requiretty" dest=/etc/sudoers state=absent
lineinfile:
regexp: '^\w+\s+requiretty'
dest: /etc/sudoers
state: absent

View File

@@ -11,6 +11,9 @@
#nameservers:
# - 127.0.0.1
dns_forward_max: 150
cache_size: 1000
# Versions
dnsmasq_version: 2.72
@@ -21,5 +24,9 @@ dnsmasq_image_tag: "{{ dnsmasq_version }}"
# Limits for dnsmasq/kubedns apps
dns_cpu_limit: 100m
dns_memory_limit: 170Mi
dns_cpu_requests: 70m
dns_memory_requests: 70Mi
dns_cpu_requests: 40m
dns_memory_requests: 50Mi
# Autoscaler parameters
dnsmasq_nodes_per_replica: 10
dnsmasq_min_replicas: 1

View File

@@ -1,4 +1,6 @@
---
- include: pre_upgrade.yml
- name: ensure dnsmasq.d directory exists
file:
path: /etc/dnsmasq.d
@@ -11,15 +13,36 @@
state: directory
tags: bootstrap-os
- name: check system nameservers
shell: awk '/^nameserver/ {print $NF}' /etc/resolv.conf
changed_when: False
register: system_nameservers
- name: init system_and_upstream_dns_servers
set_fact:
system_and_upstream_dns_servers: "{{ upstream_dns_servers|default([]) }}"
- name: combine upstream_dns_servers and system nameservers (only for docker_dns)
set_fact:
system_and_upstream_dns_servers: "{{ system_and_upstream_dns_servers | union(system_nameservers.stdout_lines) | unique }}"
when: system_nameservers.stdout != "" and resolvconf_mode != 'host_resolvconf'
- name: Write dnsmasq configuration
template:
src: 01-kube-dns.conf.j2
dest: /etc/dnsmasq.d-available/01-kube-dns.conf
mode: 0755
backup: yes
register: dnsmasq_config
- name: Stat dnsmasq configuration
stat: path=/etc/dnsmasq.d/01-kube-dns.conf
- name: Stat dnsmasq link
stat:
path: /etc/dnsmasq.d-available/01-kube-dns.conf
register: dnsmasq_stat
- name: Stat dnsmasq link
stat:
path: /etc/dnsmasq.d/01-kube-dns.conf
register: sym
- name: Move previous configuration
@@ -34,16 +57,19 @@
state: link
- name: Create dnsmasq manifests
template: src={{item.file}} dest={{kube_config_dir}}/{{item.file}}
template:
src: "{{item.file}}"
dest: "{{kube_config_dir}}/{{item.file}}"
with_items:
- {file: dnsmasq-ds.yml, type: ds}
- {file: dnsmasq-svc.yml, type: svc}
- {name: dnsmasq, file: dnsmasq-deploy.yml, type: deployment}
- {name: dnsmasq, file: dnsmasq-svc.yml, type: svc}
- {name: dnsmasq-autoscaler, file: dnsmasq-autoscaler.yml, type: deployment}
register: manifests
when: inventory_hostname == groups['kube-master'][0]
- name: Start Resources
kube:
name: dnsmasq
name: "{{item.item.name}}"
namespace: "{{system_namespace}}"
kubectl: "{{bin_dir}}/kubectl"
resource: "{{item.item.type}}"
@@ -56,6 +82,6 @@
wait_for:
host: "{{dns_server}}"
port: 53
delay: 5
when: inventory_hostname == groups['kube-node'][0]
tags: facts
timeout: 180
when: inventory_hostname == groups['kube-node'][0] and groups['kube-node'][0] in ansible_play_hosts

View File

@@ -0,0 +1,9 @@
---
- name: Delete legacy dnsmasq daemonset
  kube:
    name: dnsmasq
    namespace: "{{system_namespace}}"
    kubectl: "{{bin_dir}}/kubectl"
    resource: "ds"
    state: absent
  when: inventory_hostname == groups['kube-master'][0]

View File

@@ -11,23 +11,23 @@ server=/{{ dns_domain }}/{{ skydns_server }}
local=/{{ bogus_domains }}
#Set upstream dns servers
{% if upstream_dns_servers is defined %}
{% for srv in upstream_dns_servers %}
{% if system_and_upstream_dns_servers|length > 0 %}
{% for srv in system_and_upstream_dns_servers %}
server={{ srv }}
{% endfor %}
no-resolv
{% elif resolvconf_mode == 'host_resolvconf' %}
{# The default resolver is only needed when the host's resolv.conf was modified by us. If it was not modified, we can rely on dnsmasq to reuse the system's resolv.conf #}
server={{ default_resolver }}
no-resolv
{% endif %}
{% if kube_log_level == '4' %}
log-queries
{% endif %}
no-resolv
bogus-priv
no-negcache
cache-size=1000
cache-size={{ cache_size }}
dns-forward-max={{ dns_forward_max }}
max-cache-ttl=10
max-ttl=20
log-facility=-

View File

@@ -0,0 +1,50 @@
# Copyright 2016 The Kubernetes Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: dnsmasq-autoscaler
  namespace: kube-system
  labels:
    k8s-app: dnsmasq-autoscaler
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
spec:
  template:
    metadata:
      labels:
        k8s-app: dnsmasq-autoscaler
      annotations:
        scheduler.alpha.kubernetes.io/critical-pod: ''
        scheduler.alpha.kubernetes.io/tolerations: '[{"key":"CriticalAddonsOnly", "operator":"Exists"}]'
    spec:
      containers:
      - name: autoscaler
        image: gcr.io/google_containers/cluster-proportional-autoscaler-amd64:1.1.1
        resources:
          requests:
            cpu: "20m"
            memory: "10Mi"
        command:
        - /cluster-proportional-autoscaler
        - --namespace=kube-system
        - --configmap=dnsmasq-autoscaler
        - --target=ReplicationController/dnsmasq
        # When the cluster is using large nodes (with more cores), "coresPerReplica" should dominate.
        # If using small nodes, "nodesPerReplica" should dominate.
        - --default-params={"linear":{"nodesPerReplica":{{ dnsmasq_nodes_per_replica }},"preventSinglePointFailure":true}}
        - --logtostderr=true
        - --v={{ kube_log_level }}

View File

@@ -1,16 +1,25 @@
---
apiVersion: extensions/v1beta1
kind: DaemonSet
kind: Deployment
metadata:
name: dnsmasq
namespace: "{{system_namespace}}"
labels:
k8s-app: dnsmasq
kubernetes.io/cluster-service: "true"
spec:
replicas: {{ dnsmasq_min_replicas }}
selector:
matchLabels:
k8s-app: dnsmasq
strategy:
type: "Recreate"
template:
metadata:
labels:
k8s-app: dnsmasq
kubernetes.io/cluster-service: "true"
kargo/dnsmasq-checksum: "{{ dnsmasq_stat.stat.checksum }}"
spec:
containers:
- name: dnsmasq
@@ -20,8 +29,8 @@ spec:
- dnsmasq
args:
- -k
- "-7"
- /etc/dnsmasq.d
- -C
- /etc/dnsmasq.d/01-kube-dns.conf
securityContext:
capabilities:
add:
@@ -55,3 +64,4 @@ spec:
hostPath:
path: /etc/dnsmasq.d-available
dnsPolicy: Default # Don't use cluster DNS.

View File

@@ -1,4 +1,4 @@
docker_version: '1.12'
docker_version: '1.13'
docker_package_info:
pkgs:

View File

@@ -23,7 +23,9 @@
state: restarted
- name: Docker | pause while Docker restarts
pause: seconds=10 prompt="Waiting for docker restart"
pause:
seconds: 10
prompt: "Waiting for docker restart"
- name: Docker | wait for docker
command: "{{ docker_bin_dir }}/docker images"

View File

@@ -34,11 +34,11 @@
keyserver: "{{docker_repo_key_info.keyserver}}"
state: present
register: keyserver_task_result
until: keyserver_task_result|success
until: keyserver_task_result|succeeded
retries: 4
delay: "{{ retry_stagger | random + 3 }}"
with_items: "{{ docker_repo_key_info.repo_keys }}"
when: not ansible_os_family in ["CoreOS", "Container Linux by CoreOS"]
when: not (ansible_os_family in ["CoreOS", "Container Linux by CoreOS"] or is_atomic)
- name: ensure docker repository is enabled
action: "{{ docker_repo_info.pkg_repo }}"
@@ -46,13 +46,13 @@
repo: "{{item}}"
state: present
with_items: "{{ docker_repo_info.repos }}"
when: (not ansible_os_family in ["CoreOS", "Container Linux by CoreOS"]) and (docker_repo_info.repos|length > 0)
when: not (ansible_os_family in ["CoreOS", "Container Linux by CoreOS"] or is_atomic) and (docker_repo_info.repos|length > 0)
- name: Configure docker repository on RedHat/CentOS
copy:
src: "rh_docker.repo"
template:
src: "rh_docker.repo.j2"
dest: "/etc/yum.repos.d/docker.repo"
when: ansible_distribution in ["CentOS","RedHat"]
when: ansible_distribution in ["CentOS","RedHat"] and not is_atomic
- name: ensure docker packages are installed
action: "{{ docker_package_info.pkg_mgr }}"
@@ -61,14 +61,15 @@
force: "{{item.force|default(omit)}}"
state: present
register: docker_task_result
until: docker_task_result|success
until: docker_task_result|succeeded
retries: 4
delay: "{{ retry_stagger | random + 3 }}"
with_items: "{{ docker_package_info.pkgs }}"
when: (not ansible_os_family in ["CoreOS", "Container Linux by CoreOS"]) and (docker_package_info.pkgs|length > 0)
notify: restart docker
when: not (ansible_os_family in ["CoreOS", "Container Linux by CoreOS"] or is_atomic) and (docker_package_info.pkgs|length > 0)
- name: check minimum docker version for docker_dns mode. You need at least docker version >= 1.12 for resolvconf_mode=docker_dns
shell: docker version -f "{{ '{{' }}.Client.Version{{ '}}' }}"
command: "docker version -f '{{ '{{' }}.Client.Version{{ '}}' }}'"
register: docker_version
failed_when: docker_version.stdout|version_compare('1.12', '<')
changed_when: false

View File

@@ -32,11 +32,13 @@
shell: grep "^nameserver" /etc/resolv.conf | sed 's/^nameserver\s*//'
changed_when: False
register: system_nameservers
check_mode: no
- name: check system search domains
shell: grep "^search" /etc/resolv.conf | sed 's/^search\s*//'
changed_when: False
register: system_search_domains
check_mode: no
- name: add system nameservers to docker options
set_fact:
@@ -49,13 +51,16 @@
when: system_search_domains.stdout != ""
- name: check number of nameservers
fail: msg="Too many nameservers"
fail:
msg: "Too many nameservers"
when: docker_dns_servers|length > 3
- name: check number of search domains
fail: msg="Too many search domains"
fail:
msg: "Too many search domains"
when: docker_dns_search_domains|length > 6
- name: check length of search domains
fail: msg="Search domains exceeded limit of 256 characters"
fail:
msg: "Search domains exceeded limit of 256 characters"
when: docker_dns_search_domains|join(' ')|length > 256

View File

@@ -1,6 +1,8 @@
---
- name: Create docker service systemd directory if it doesn't exist
file: path=/etc/systemd/system/docker.service.d state=directory
file:
path: /etc/systemd/system/docker.service.d
state: directory
- name: Write docker proxy drop-in
template:
@@ -13,7 +15,14 @@
src: docker.service.j2
dest: /etc/systemd/system/docker.service
register: docker_service_file
when: not ansible_os_family in ["CoreOS", "Container Linux by CoreOS"]
when: not (ansible_os_family in ["CoreOS", "Container Linux by CoreOS"] or is_atomic)
- name: Write docker.service systemd file for atomic
template:
src: docker_atomic.service.j2
dest: /etc/systemd/system/docker.service
notify: restart docker
when: is_atomic
- name: Write docker options systemd drop-in
template:

View File

@@ -3,4 +3,4 @@ Environment="DOCKER_DNS_OPTIONS=\
{% for d in docker_dns_servers %}--dns {{ d }} {% endfor %} \
{% for d in docker_dns_search_domains %}--dns-search {{ d }} {% endfor %} \
{% for o in docker_dns_options %}--dns-opt {{ o }} {% endfor %} \
"
"

View File

@@ -1,2 +1,2 @@
[Service]
Environment="DOCKER_OPTS={% if docker_options is defined %}{{ docker_options }}{% endif %}"
Environment="DOCKER_OPTS={% if docker_options is defined %}{{ docker_options }}{% endif %} --iptables={% if kube_network_plugin == 'flannel' %}true{% else %}false{% endif %}"

View File

@@ -0,0 +1,37 @@
[Unit]
Description=Docker Application Container Engine
Documentation=http://docs.docker.com
After=network.target
Wants=docker-storage-setup.service
[Service]
Type=notify
NotifyAccess=all
EnvironmentFile=-/etc/sysconfig/docker
EnvironmentFile=-/etc/sysconfig/docker-storage
Environment=GOTRACEBACK=crash
Environment=DOCKER_HTTP_HOST_COMPAT=1
Environment=PATH=/usr/libexec/docker:/usr/bin:/usr/sbin
ExecReload=/bin/kill -s HUP $MAINPID
Delegate=yes
KillMode=process
ExecStart=/usr/bin/dockerd-current \
--add-runtime docker-runc=/usr/libexec/docker/docker-runc-current \
--default-runtime=docker-runc \
--exec-opt native.cgroupdriver=systemd \
--userland-proxy-path=/usr/libexec/docker/docker-proxy-current \
$DOCKER_OPTS \
$DOCKER_STORAGE_OPTIONS \
$DOCKER_NETWORK_OPTIONS \
$DOCKER_DNS_OPTIONS \
$ADD_REGISTRY \
$BLOCK_REGISTRY \
$INSECURE_REGISTRY
LimitNOFILE=1048576
LimitNPROC=1048576
LimitCORE=infinity
TimeoutStartSec=1min
Restart=on-abnormal
[Install]
WantedBy=multi-user.target

View File

@@ -4,3 +4,4 @@ baseurl=https://yum.dockerproject.org/repo/main/centos/7
enabled=1
gpgcheck=1
gpgkey=https://yum.dockerproject.org/gpg
{% if http_proxy is defined %}proxy={{ http_proxy }}{% endif %}

View File

@@ -4,7 +4,10 @@ docker_kernel_min_version: '3.10'
docker_versioned_pkg:
  'latest': docker-engine
  '1.11': docker-engine=1.11.2-0~{{ ansible_distribution_release|lower }}
  '1.12': docker-engine=1.12.5-0~debian-{{ ansible_distribution_release|lower }}
  '1.12': docker-engine=1.12.6-0~debian-{{ ansible_distribution_release|lower }}
  '1.13': docker-engine=1.13.1-0~debian-{{ ansible_distribution_release|lower }}
  'stable': docker-engine=17.03.0~ce-0~debian-{{ ansible_distribution_release|lower }}
  'edge': docker-engine=17.03.0~ce-0~debian-{{ ansible_distribution_release|lower }}
docker_package_info:
  pkg_mgr: apt

View File

@@ -1,9 +1,15 @@
docker_kernel_min_version: '0'
# https://docs.docker.com/engine/installation/linux/fedora/#install-from-a-package
# https://download.docker.com/linux/fedora/7/x86_64/stable/
# the package names below are guesses;
# docs mention `sudo dnf config-manager --enable docker-ce-edge` for edge
docker_versioned_pkg:
  'latest': docker
  '1.11': docker-1:1.11.2
  '1.12': docker-1:1.12.5
  'stable': docker-ce
  'edge': docker-ce-edge
docker_package_info:
  pkg_mgr: dnf

Some files were not shown because too many files have changed in this diff.