[K8S] Building a Kubernetes Cluster with TLS Authentication





Note:

This article follows https://github.com/opsnull/follow-me-install-kubernetes-cluster (follow-me-install-kubernetes-cluster) step by step.

If there are any copyright concerns, please leave a comment. Thank you!

A number of changes were made to fit my own environment. The IPs in the original map to mine as follows (the original IPs may not have been replaced everywhere):

10.64.3.7  → 192.168.1.206  (etcd-host0)
10.64.3.8  → 192.168.1.207  (etcd-host1)
10.64.3.86 → 192.168.1.208  (etcd-host2)




01 - Component Versions and Cluster Environment

Cluster components and versions

  • Kubernetes 1.6.2
  • Docker 17.04.0-ce
  • Etcd 3.1.6
  • Flanneld 0.7.1 (vxlan network)
  • TLS-encrypted communication between all components (etcd, kubernetes master and nodes)
  • RBAC authorization
  • kubelet TLS Bootstrapping
  • kubedns, dashboard, heapster (influxdb, grafana), and EFK (elasticsearch, fluentd, kibana) add-ons
  • private docker registry backed by ceph rgw storage, with TLS + HTTP Basic authentication

Cluster machines

  • 192.168.1.206: master, registry
  • 192.168.1.207: node01
  • 192.168.1.208: node02

Since this is a test setup, the etcd cluster, the kubernetes master, and the kubernetes nodes all share these three machines.


Initialize the systems and disable firewalld and SELinux, as sketched below.
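A minimal sketch of that initialization, assuming a CentOS 7 style system with systemd (adjust to your own distribution):

$ systemctl stop firewalld && systemctl disable firewalld        # stop and disable the firewall
$ setenforce 0                                                   # switch SELinux to permissive for the current boot
$ sed -i 's/^SELINUX=.*/SELINUX=disabled/' /etc/selinux/config   # keep it disabled after a reboot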

Distributing the cluster environment variable script

Copy the global variable definition script to the /root/local/bin directory on every machine (a sketch of the remote copies follows the snippet below), and source it from /etc/profile:

$ cp environment.sh /root/local/bin
$ vi /etc/profile
source /root/local/bin/environment.sh
:wq
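One possible way to push the script to the other two machines, assuming passwordless root SSH is already set up (the host list matches the machines above):

$ for host in 192.168.1.207 192.168.1.208; do
    scp /root/local/bin/environment.sh root@${host}:/root/local/bin/
  done

Remember that NODE_NAME and NODE_IP must be adjusted on each host, as shown in the per-host files below.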


192.168.1.206

environment.sh

#!/usr/bin/bash
BOOTSTRAP_TOKEN="41f7e4ba8b7be874fcff18bf5cf41a7c"
# prefer ranges unused by the hosts for the service and Pod networks
# service network (Service CIDR); unroutable before deployment, reachable inside the cluster as IP:Port afterwards
SERVICE_CIDR="10.254.0.0/16"
# Pod network (Cluster CIDR); unroutable before deployment, routable afterwards (guaranteed by flanneld)
CLUSTER_CIDR="172.30.0.0/16"
# service port range (NodePort range)
export NODE_PORT_RANGE="8400-9000"
# etcd cluster client endpoints
export ETCD_ENDPOINTS="https://192.168.1.206:2379,https://192.168.1.207:2379,https://192.168.1.208:2379"
# flanneld network configuration prefix
export FLANNEL_ETCD_PREFIX="/kubernetes/network"
# kubernetes service IP (usually the first IP of SERVICE_CIDR)
export CLUSTER_KUBERNETES_SVC_IP="10.254.0.1"
# cluster DNS service IP (pre-allocated from SERVICE_CIDR)
export CLUSTER_DNS_SVC_IP="10.254.0.2"
# cluster DNS domain
export CLUSTER_DNS_DOMAIN="cluster.local."
export NODE_NAME=etcd-host0 # name of this machine (any name that tells the machines apart)
export NODE_IP=192.168.1.206 # IP of this machine
export NODE_IPS="192.168.1.206 192.168.1.207 192.168.1.208" # IPs of all etcd cluster machines
# IPs and ports used for etcd cluster-internal communication
export ETCD_NODES=etcd-host0=https://192.168.1.206:2380,etcd-host1=https://192.168.1.207:2380,etcd-host2=https://192.168.1.208:2380
# other global variables referenced by later steps: ETCD_ENDPOINTS, FLANNEL_ETCD_PREFIX, CLUSTER_CIDR
export PATH=/root/local/bin:$PATH
export MASTER_IP=192.168.1.206 # replace with the IP of any kubernetes master machine
export KUBE_APISERVER="https://${MASTER_IP}:6443"
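The BOOTSTRAP_TOKEN above is only a sample value. One way to generate your own random token (my suggestion, not a requirement of this walkthrough) is:

$ head -c 16 /dev/urandom | od -An -t x | tr -d ' '

Put the generated value into BOOTSTRAP_TOKEN in environment.sh on every machine; it is reused later for the kubelet TLS Bootstrapping token file.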


192.168.1.207

environment.sh

#!/usr/bin/bash
BOOTSTRAP_TOKEN="41f7e4ba8b7be874fcff18bf5cf41a7c"
# prefer ranges unused by the hosts for the service and Pod networks
# service network (Service CIDR); unroutable before deployment, reachable inside the cluster as IP:Port afterwards
SERVICE_CIDR="10.254.0.0/16"
# Pod network (Cluster CIDR); unroutable before deployment, routable afterwards (guaranteed by flanneld)
CLUSTER_CIDR="172.30.0.0/16"
# service port range (NodePort range)
export NODE_PORT_RANGE="8400-9000"
# etcd cluster client endpoints
export ETCD_ENDPOINTS="https://192.168.1.206:2379,https://192.168.1.207:2379,https://192.168.1.208:2379"
# flanneld network configuration prefix
export FLANNEL_ETCD_PREFIX="/kubernetes/network"
# kubernetes service IP (usually the first IP of SERVICE_CIDR)
export CLUSTER_KUBERNETES_SVC_IP="10.254.0.1"
# cluster DNS service IP (pre-allocated from SERVICE_CIDR)
export CLUSTER_DNS_SVC_IP="10.254.0.2"
# cluster DNS domain
export CLUSTER_DNS_DOMAIN="cluster.local."
export NODE_NAME=etcd-host1 # name of this machine (any name that tells the machines apart)
export NODE_IP=192.168.1.207 # IP of this machine
export NODE_IPS="192.168.1.206 192.168.1.207 192.168.1.208" # IPs of all etcd cluster machines
# IPs and ports used for etcd cluster-internal communication
export ETCD_NODES=etcd-host0=https://192.168.1.206:2380,etcd-host1=https://192.168.1.207:2380,etcd-host2=https://192.168.1.208:2380
# other global variables referenced by later steps: ETCD_ENDPOINTS, FLANNEL_ETCD_PREFIX, CLUSTER_CIDR
export PATH=/root/local/bin:$PATH
export MASTER_IP=192.168.1.206 # replace with the IP of any kubernetes master machine
export KUBE_APISERVER="https://${MASTER_IP}:6443"


192.168.1.208

environment.sh

#!/usr/bin/bash
BOOTSTRAP_TOKEN="41f7e4ba8b7be874fcff18bf5cf41a7c"
# prefer ranges unused by the hosts for the service and Pod networks
# service network (Service CIDR); unroutable before deployment, reachable inside the cluster as IP:Port afterwards
SERVICE_CIDR="10.254.0.0/16"
# Pod network (Cluster CIDR); unroutable before deployment, routable afterwards (guaranteed by flanneld)
CLUSTER_CIDR="172.30.0.0/16"
# service port range (NodePort range)
export NODE_PORT_RANGE="8400-9000"
# etcd cluster client endpoints
export ETCD_ENDPOINTS="https://192.168.1.206:2379,https://192.168.1.207:2379,https://192.168.1.208:2379"
# flanneld network configuration prefix
export FLANNEL_ETCD_PREFIX="/kubernetes/network"
# kubernetes service IP (usually the first IP of SERVICE_CIDR)
export CLUSTER_KUBERNETES_SVC_IP="10.254.0.1"
# cluster DNS service IP (pre-allocated from SERVICE_CIDR)
export CLUSTER_DNS_SVC_IP="10.254.0.2"
# cluster DNS domain
export CLUSTER_DNS_DOMAIN="cluster.local."
export NODE_NAME=etcd-host2 # name of this machine (any name that tells the machines apart)
export NODE_IP=192.168.1.208 # IP of this machine
export NODE_IPS="192.168.1.206 192.168.1.207 192.168.1.208" # IPs of all etcd cluster machines
# IPs and ports used for etcd cluster-internal communication
export ETCD_NODES=etcd-host0=https://192.168.1.206:2380,etcd-host1=https://192.168.1.207:2380,etcd-host2=https://192.168.1.208:2380
# other global variables referenced by later steps: ETCD_ENDPOINTS, FLANNEL_ETCD_PREFIX, CLUSTER_CIDR
export PATH=/root/local/bin:$PATH
export MASTER_IP=192.168.1.206 # replace with the IP of any kubernetes master machine
export KUBE_APISERVER="https://${MASTER_IP}:6443"




02 - Creating the CA Certificate and Key

Creating the CA certificate and key

The kubernetes components encrypt their communication with TLS certificates. This document uses cfssl, CloudFlare's PKI toolkit, to generate the Certificate Authority (CA) certificate and key files. The CA is a self-signed certificate and is used to sign every other TLS certificate created later.

Installing CFSSL

$ wget https://pkg.cfssl.org/R1.2/cfssl_linux-amd64
$ chmod +x cfssl_linux-amd64
$ sudo mv cfssl_linux-amd64 /root/local/bin/cfssl

$ wget https://pkg.cfssl.org/R1.2/cfssljson_linux-amd64
$ chmod +x cfssljson_linux-amd64
$ sudo mv cfssljson_linux-amd64 /root/local/bin/cfssljson

$ wget https://pkg.cfssl.org/R1.2/cfssl-certinfo_linux-amd64
$ chmod +x cfssl-certinfo_linux-amd64
$ sudo mv cfssl-certinfo_linux-amd64 /root/local/bin/cfssl-certinfo

$ export PATH=/root/local/bin:$PATH
$ mkdir ssl
$ cd ssl
$ cfssl print-defaults config > config.json
$ cfssl print-defaults csr > csr.json

These tools must be installed on every node.

Creating the CA (Certificate Authority)

Create the CA configuration file:

$ cat ca-config.json
{
  "signing": {
    "default": {
      "expiry": "8760h"
    },
    "profiles": {
      "kubernetes": {
        "usages": [
          "signing",
          "key encipherment",
          "server auth",
          "client auth"
        ],
        "expiry": "8760h"
      }
    }
  }
}

  • ca-config.json: multiple profiles can be defined, each with its own expiry, usage scenarios, and other parameters; a specific profile is selected later when signing certificates;
  • signing: the certificate can be used to sign other certificates; the generated ca.pem has CA=TRUE;
  • server auth: a client may use this CA to verify certificates presented by servers;
  • client auth: a server may use this CA to verify certificates presented by clients;

Create the CA certificate signing request:

$ cat ca-csr.json
{
  "CN": "kubernetes",
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "BeiJing",
      "L": "BeiJing",
      "O": "k8s",
      "OU": "System"
    }
  ]
}

  • "CN": Common Name; kube-apiserver extracts this field from the certificate and uses it as the requesting user name (User Name); browsers use it to verify whether a website is legitimate;
  • "O": Organization; kube-apiserver extracts this field from the certificate and uses it as the group (Group) the requesting user belongs to;

Generate the CA certificate and private key:

$ cfssl gencert -initca ca-csr.json | cfssljson -bare ca
$ ls ca*
ca-config.json  ca.csr  ca-csr.json  ca-key.pem  ca.pem
$

Distributing the certificates

Copy the generated CA certificate, key, and configuration file to the /etc/kubernetes/ssl directory on every machine (a sketch of the remote copies follows):

$ sudo mkdir -p /etc/kubernetes/ssl
$ sudo cp ca* /etc/kubernetes/ssl
$
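A sketch of the remote copies to the other two machines, again assuming passwordless root SSH:

$ for host in 192.168.1.207 192.168.1.208; do
    ssh root@${host} "mkdir -p /etc/kubernetes/ssl"
    scp ca* root@${host}:/etc/kubernetes/ssl/
  done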

Verifying a certificate (this is an example)

Take the kubernetes certificate (generated later when deploying the master node) as an example.

Using the openssl command

$ openssl x509 -noout -text -in kubernetes.pem

    Signature Algorithm: sha256WithRSAEncryption
        Issuer: C=CN, ST=BeiJing, L=BeiJing, O=k8s, OU=System, CN=Kubernetes
        Validity
            Not Before: Apr  5 05:36:00 2017 GMT
            Not After : Apr  5 05:36:00 2018 GMT
        Subject: C=CN, ST=BeiJing, L=BeiJing, O=k8s, OU=System, CN=kubernetes

        X509v3 extensions:
            X509v3 Key Usage: critical
                Digital Signature, Key Encipherment
            X509v3 Extended Key Usage:
                TLS Web Server Authentication, TLS Web Client Authentication
            X509v3 Basic Constraints: critical
                CA:FALSE
            X509v3 Subject Key Identifier:
                DD:52:04:43:10:13:A9:29:24:17:3A:0E:D7:14:DB:36:F8:6C:E0:E0
            X509v3 Authority Key Identifier:
                keyid:44:04:3B:60:BD:69:78:14:68:AF:A0:41:13:F6:17:07:13:63:58:CD

            X509v3 Subject Alternative Name:
                DNS:kubernetes, DNS:kubernetes.default, DNS:kubernetes.default.svc, DNS:kubernetes.default.svc.cluster, DNS:kubernetes.default.svc.cluster.local, IP Address:127.0.0.1, IP Address:10.64.3.7, IP Address:10.254.0.1


  • confirm that the Issuer field matches ca-csr.json;
  • confirm that the Subject field matches kubernetes-csr.json;
  • confirm that the X509v3 Subject Alternative Name field matches kubernetes-csr.json;
  • confirm that the X509v3 Key Usage and Extended Key Usage fields match the kubernetes profile in ca-config.json;

Using the cfssl-certinfo command

$ cfssl-certinfo -cert kubernetes.pem

{
  "subject": {
    "common_name": "kubernetes",
    "country": "CN",
    "organization": "k8s",
    "organizational_unit": "System",
    "locality": "BeiJing",
    "province": "BeiJing",
    "names": [
      "CN",
      "BeiJing",
      "BeiJing",
      "k8s",
      "System",
      "kubernetes"
    ]
  },
  "issuer": {
    "common_name": "Kubernetes",
    "country": "CN",
    "organization": "k8s",
    "organizational_unit": "System",
    "locality": "BeiJing",
    "province": "BeiJing",
    "names": [
      "CN",
      "BeiJing",
      "BeiJing",
      "k8s",
      "System",
      "Kubernetes"
    ]
  },
  "serial_number": "174360492872423263473151971632292895707129022309",
  "sans": [
    "kubernetes",
    "kubernetes.default",
    "kubernetes.default.svc",
    "kubernetes.default.svc.cluster",
    "kubernetes.default.svc.cluster.local",
    "127.0.0.1",
    "192.168.1.206",
    "192.168.1.207",
    "192.168.1.208",
    "10.254.0.1"
  ],
  "not_before": "2017-04-05T05:36:00Z",
  "not_after": "2018-04-05T05:36:00Z",
  "sigalg": "SHA256WithRSA",





03 - Deploying a Highly Available Etcd Cluster

Deploying a highly available etcd cluster

kubernetes stores all of its data in etcd. This document describes how to deploy a three-node highly available etcd cluster. The three nodes reuse the kubernetes master machines and are named etcd-host0, etcd-host1, and etcd-host2:

  • etcd-host0: 192.168.1.206
  • etcd-host1: 192.168.1.207
  • etcd-host2: 192.168.1.208

Variables used

The variables used in this document are defined as follows (they were already added to the environment.sh script earlier):

$ export NODE_NAME=etcd-host0 # name of the machine being deployed (any name that tells the machines apart)
$ export NODE_IP=192.168.1.206 # IP of the machine being deployed
$ export NODE_IPS="192.168.1.206 192.168.1.207 192.168.1.208" # IPs of all etcd cluster machines
$ # IPs and ports used for etcd cluster-internal communication
$ export ETCD_NODES=etcd-host0=https://192.168.1.206:2380,etcd-host1=https://192.168.1.207:2380,etcd-host2=https://192.168.1.208:2380
$ # import the other global variables used: ETCD_ENDPOINTS, FLANNEL_ETCD_PREFIX, CLUSTER_CIDR
$ source /root/local/bin/environment.sh
$

Downloading the binaries

Download the latest release from the https://github.com/coreos/etcd/releases page:

$ wget https://github.com/coreos/etcd/releases/download/v3.1.6/etcd-v3.1.6-linux-amd64.tar.gz
$ tar -xvf etcd-v3.1.6-linux-amd64.tar.gz
$ sudo mv etcd-v3.1.6-linux-amd64/etcd* /root/local/bin
$

Creating the TLS key and certificate

To secure the communication, clients (such as etcdctl) talking to the etcd cluster, as well as the etcd members talking to each other, use TLS encryption. This section creates the certificate and private key that etcd needs.

Create the etcd certificate signing request:


$ cat > etcd-csr.json <<EOF
{
  "CN": "etcd",
  "hosts": [
    "127.0.0.1",
    "${NODE_IP}"
  ],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "BeiJing",
      "L": "BeiJing",
      "O": "k8s",
      "OU": "System"
    }
  ]
}
EOF


  • the hosts field lists the etcd node IPs authorized to use this certificate;

Generate the etcd certificate and private key:

$ cfssl gencert -ca=/etc/kubernetes/ssl/ca.pem \
  -ca-key=/etc/kubernetes/ssl/ca-key.pem \
  -config=/etc/kubernetes/ssl/ca-config.json \
  -profile=kubernetes etcd-csr.json | cfssljson -bare etcd
$ ls etcd*
etcd.csr  etcd-csr.json  etcd-key.pem  etcd.pem
$ sudo mkdir -p /etc/etcd/ssl
$ sudo mv etcd*.pem /etc/etcd/ssl
$ rm etcd.csr etcd-csr.json

Creating the etcd systemd unit file

$ sudo mkdir -p /var/lib/etcd  # the working directory must be created first
$ cat > etcd.service <<EOF
[Unit]
Description=Etcd Server
After=network.target
After=network-online.target
Wants=network-online.target
Documentation=https://github.com/coreos

[Service]
Type=notify
WorkingDirectory=/var/lib/etcd/
ExecStart=/root/local/bin/etcd \\
  --name=${NODE_NAME} \\
  --cert-file=/etc/etcd/ssl/etcd.pem \\
  --key-file=/etc/etcd/ssl/etcd-key.pem \\
  --peer-cert-file=/etc/etcd/ssl/etcd.pem \\
  --peer-key-file=/etc/etcd/ssl/etcd-key.pem \\
  --trusted-ca-file=/etc/kubernetes/ssl/ca.pem \\
  --peer-trusted-ca-file=/etc/kubernetes/ssl/ca.pem \\
  --initial-advertise-peer-urls=https://${NODE_IP}:2380 \\
  --listen-peer-urls=https://${NODE_IP}:2380 \\
  --listen-client-urls=https://${NODE_IP}:2379,http://127.0.0.1:2379 \\
  --advertise-client-urls=https://${NODE_IP}:2379 \\
  --initial-cluster-token=etcd-cluster-0 \\
  --initial-cluster=${ETCD_NODES} \\
  --initial-cluster-state=new \\
  --data-dir=/var/lib/etcd
Restart=on-failure
RestartSec=5
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOF


  • etcd's working and data directory is /var/lib/etcd, which must be created before starting the service;
  • to secure the communication, the unit specifies etcd's own certificate and key (cert-file and key-file), the certificate, key, and CA for peer communication (peer-cert-file, peer-key-file, peer-trusted-ca-file), and the CA used to verify clients (trusted-ca-file);
  • when --initial-cluster-state is new, the value of --name must appear in the --initial-cluster list;

The complete unit file: etcd.service

Starting the etcd service

$ sudo mv etcd.service /etc/systemd/system/
$ sudo systemctl daemon-reload
$ sudo systemctl enable etcd
$ sudo systemctl start etcd
$ systemctl status etcd
$

The first etcd process to start will appear to hang for a while, waiting for the etcd processes on the other nodes to join the cluster; this is normal.

Repeat the steps above on all etcd nodes until the etcd service has been started on every machine.

Verifying the service

After the etcd cluster is deployed, run the following command on any etcd node:


$ for ip in ${NODE_IPS}; do
    ETCDCTL_API=3 /root/local/bin/etcdctl \
    --endpoints=https://${ip}:2379 \
    --cacert=/etc/kubernetes/ssl/ca.pem \
    --cert=/etc/etcd/ssl/etcd.pem \
    --key=/etc/etcd/ssl/etcd-key.pem \
    endpoint health; done

Expected result:

2017-07-05 17:11:58.103401 I | warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
https://192.168.1.206:2379 is healthy: successfully committed proposal: took = 81.247077ms
2017-07-05 17:11:58.356539 I | warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
https://192.168.1.207:2379 is healthy: successfully committed proposal: took = 12.073555ms
2017-07-05 17:11:58.523829 I | warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
https://192.168.1.208:2379 is healthy: successfully committed proposal: took = 5.413361ms
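As an additional check (my own addition, not part of the original walkthrough), the cluster membership can be listed from any node with the same certificates:

$ ETCDCTL_API=3 /root/local/bin/etcdctl \
  --endpoints=https://${NODE_IP}:2379 \
  --cacert=/etc/kubernetes/ssl/ca.pem \
  --cert=/etc/etcd/ssl/etcd.pem \
  --key=/etc/etcd/ssl/etcd-key.pem \
  member list

All three members (etcd-host0, etcd-host1, etcd-host2) should be listed with their peer and client URLs.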



04 - Deploying the Kubectl Command-Line Tool

Deploying the kubectl command-line tool

By default kubectl reads the kube-apiserver address, certificates, user name, and other settings from the ~/.kube/config file. Without that file, commands fail with:

$ kubectl get pods
The connection to the server localhost:8080 was refused - did you specify the right host or port?

This document describes how to download and configure kubectl, the kubernetes cluster command-line tool.

The downloaded kubectl binary and the generated ~/.kube/config file need to be copied to every machine where kubectl commands will be run.

Variables used

The variables used in this document are defined as follows (they were already added to the environment.sh script earlier):

$ export MASTER_IP=192.168.1.206 # performed on the master node, 206
$ export KUBE_APISERVER="https://${MASTER_IP}:6443"
$

  • the KUBE_APISERVER variable is the kube-apiserver address that kubectl accesses; it is later written into the ~/.kube/config file;

Downloading kubectl

$ wget https://dl.k8s.io/v1.6.2/kubernetes-client-linux-amd64.tar.gz
$ tar -xzvf kubernetes-client-linux-amd64.tar.gz
$ sudo cp kubernetes/client/bin/kube* /root/local/bin/
$ chmod a+x /root/local/bin/kube*
$ export PATH=/root/local/bin:$PATH
$

Creating the admin certificate

kubectl communicates with kube-apiserver over the secure port, which requires a TLS certificate and key.

Create the admin certificate signing request:

$ cat admin-csr.json
{
  "CN": "admin",
  "hosts": [],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "BeiJing",
      "L": "BeiJing",
      "O": "system:masters",
      "OU": "System"
    }
  ]
}


  • kube-apiserver later uses RBAC to authorize client requests (from kubelet, kube-proxy, Pods, and so on);
  • kube-apiserver predefines some RoleBindings used by RBAC; for example, cluster-admin binds the Group system:masters to the Role cluster-admin, which grants permission to call all kube-apiserver APIs;
  • O sets the certificate's Group to system:masters; when kubectl uses this certificate to access kube-apiserver, authentication succeeds because the certificate is signed by our CA, and since the certificate's group is the pre-authorized system:masters, it is granted access to all APIs;
  • the hosts attribute is an empty list;

Generate the admin certificate and private key:

$ cfssl gencert -ca=/etc/kubernetes/ssl/ca.pem \
  -ca-key=/etc/kubernetes/ssl/ca-key.pem \
  -config=/etc/kubernetes/ssl/ca-config.json \
  -profile=kubernetes admin-csr.json | cfssljson -bare admin
$ ls admin*
admin.csr  admin-csr.json  admin-key.pem  admin.pem
$ sudo mv admin*.pem /etc/kubernetes/ssl/
$ rm admin.csr admin-csr.json
$

Creating the kubectl kubeconfig file

$ # set cluster parameters
$ kubectl config set-cluster kubernetes \
  --certificate-authority=/etc/kubernetes/ssl/ca.pem \
  --embed-certs=true \
  --server=${KUBE_APISERVER}
$ # set client authentication parameters
$ kubectl config set-credentials admin \
  --client-certificate=/etc/kubernetes/ssl/admin.pem \
  --embed-certs=true \
  --client-key=/etc/kubernetes/ssl/admin-key.pem
$ # set context parameters
$ kubectl config set-context kubernetes \
  --cluster=kubernetes \
  --user=admin
$ # set the default context
$ kubectl config use-context kubernetes


  • the admin.pem certificate's O field is system:masters; the kube-apiserver predefined RoleBinding cluster-admin binds the Group system:masters to the Role cluster-admin, which grants permission to call the kube-apiserver APIs;
  • the generated kubeconfig is saved to the ~/.kube/config file;

Distributing the kubeconfig file

Copy the ~/.kube/config file into the ~/.kube/ directory of every machine that runs kubectl commands (a sketch of the copies follows).
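One possible way to do that, assuming passwordless root SSH to the other machines:

$ for host in 192.168.1.207 192.168.1.208; do
    scp /root/local/bin/kubectl root@${host}:/root/local/bin/
    ssh root@${host} "mkdir -p /root/.kube"
    scp /root/.kube/config root@${host}:/root/.kube/config
  done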






05 - Deploying the Flannel Network

Deploying the Flannel network

kubernetes requires that all nodes in the cluster can reach each other over the Pod network. This document describes how to use Flannel to create an interconnected Pod network on all nodes (master and nodes).

Variables used

The variables used in this document are defined as follows:


$ export NODE_IP=192.168.1.206 # IP of the node being deployed
$ # import the other global variables used: ETCD_ENDPOINTS, FLANNEL_ETCD_PREFIX, CLUSTER_CIDR
$ source /root/local/bin/environment.sh
$

Creating the TLS key and certificate

The etcd cluster has mutual TLS authentication enabled, so flanneld must be given the CA certificate and a key pair for talking to etcd.

Create the flanneld certificate signing request:

$ cat > flanneld-csr.json <<EOF
{
  "CN": "flanneld",
  "hosts": [],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "BeiJing",
      "L": "BeiJing",
      "O": "k8s",
      "OU": "System"
    }
  ]
}
EOF


  • the hosts field is empty;

Generate the flanneld certificate and private key:

$ cfssl gencert -ca=/etc/kubernetes/ssl/ca.pem \
  -ca-key=/etc/kubernetes/ssl/ca-key.pem \
  -config=/etc/kubernetes/ssl/ca-config.json \
  -profile=kubernetes flanneld-csr.json | cfssljson -bare flanneld
$ ls flanneld*
flanneld.csr  flanneld-csr.json  flanneld-key.pem  flanneld.pem
$ sudo mkdir -p /etc/flanneld/ssl
$ sudo mv flanneld*.pem /etc/flanneld/ssl
$ rm flanneld.csr flanneld-csr.json

Writing the cluster Pod network information to etcd

Note: this step only needs to be done the first time the Flannel network is deployed; there is no need to write this information again when deploying Flannel on other nodes!

$ /root/local/bin/etcdctl \
  --endpoints=${ETCD_ENDPOINTS} \
  --ca-file=/etc/kubernetes/ssl/ca.pem \
  --cert-file=/etc/flanneld/ssl/flanneld.pem \
  --key-file=/etc/flanneld/ssl/flanneld-key.pem \
  set ${FLANNEL_ETCD_PREFIX}/config '{"Network":"'${CLUSTER_CIDR}'", "SubnetLen": 24, "Backend": {"Type": "vxlan"}}'


  • the current flanneld version (v0.7.1) does not support etcd v3, so the configuration key and network data are written with the etcd v2 API;
  • the Pod network written here (${CLUSTER_CIDR}, 172.30.0.0/16) must match the --cluster-cidr option of kube-controller-manager;

Installing and configuring flanneld

Downloading flanneld

$ mkdir flannel
$ wget https://github.com/coreos/flannel/releases/download/v0.7.1/flannel-v0.7.1-linux-amd64.tar.gz
$ tar -xzvf flannel-v0.7.1-linux-amd64.tar.gz -C flannel
$ sudo cp flannel/{flanneld,mk-docker-opts.sh} /root/local/bin
$

Creating the flanneld systemd unit file

$ cat > flanneld.service << EOF
[Unit]
Description=Flanneld overlay address etcd agent
After=network.target
After=network-online.target
Wants=network-online.target
After=etcd.service
Before=docker.service

[Service]
Type=notify
ExecStart=/root/local/bin/flanneld \\
  -etcd-cafile=/etc/kubernetes/ssl/ca.pem \\
  -etcd-certfile=/etc/flanneld/ssl/flanneld.pem \\
  -etcd-keyfile=/etc/flanneld/ssl/flanneld-key.pem \\
  -etcd-endpoints=${ETCD_ENDPOINTS} \\
  -etcd-prefix=${FLANNEL_ETCD_PREFIX}
ExecStartPost=/root/local/bin/mk-docker-opts.sh -k DOCKER_NETWORK_OPTIONS -d /run/flannel/docker
Restart=on-failure

[Install]
WantedBy=multi-user.target
RequiredBy=docker.service
EOF


  • the mk-docker-opts.sh script writes the Pod subnet assigned to flanneld into the /run/flannel/docker file; docker later uses the parameters in this file to configure the docker0 bridge when it starts;
  • flanneld talks to the other nodes over the interface that holds the system default route; for machines with multiple interfaces (e.g. an internal and a public network), the -iface option selects the interface to use (the unit file above does not set it; see the sketch after this list);
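For reference, a hedged sketch of the same ExecStart with an explicit interface; eth0 is only a placeholder for your internal NIC:

ExecStart=/root/local/bin/flanneld \\
  -etcd-cafile=/etc/kubernetes/ssl/ca.pem \\
  -etcd-certfile=/etc/flanneld/ssl/flanneld.pem \\
  -etcd-keyfile=/etc/flanneld/ssl/flanneld-key.pem \\
  -etcd-endpoints=${ETCD_ENDPOINTS} \\
  -etcd-prefix=${FLANNEL_ETCD_PREFIX} \\
  -iface=eth0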


The complete unit file: flanneld.service

Starting flanneld

$ sudo cp flanneld.service /etc/systemd/system/
$ sudo systemctl daemon-reload
$ sudo systemctl enable flanneld
$ sudo systemctl start flanneld
$ systemctl status flanneld
$

Checking the flanneld service

$ journalctl -u flanneld | grep 'Lease acquired'
$ ifconfig flannel.1
$

Checking the Pod subnets assigned to each flanneld

$ # view the cluster Pod network (/16)
$ /root/local/bin/etcdctl \
  --endpoints=${ETCD_ENDPOINTS} \
  --ca-file=/etc/kubernetes/ssl/ca.pem \
  --cert-file=/etc/flanneld/ssl/flanneld.pem \
  --key-file=/etc/flanneld/ssl/flanneld-key.pem \
  get ${FLANNEL_ETCD_PREFIX}/config
{ "Network": "172.30.0.0/16", "SubnetLen": 24, "Backend": { "Type": "vxlan" } }
$ # list the Pod subnets that have been allocated (/24)
$ /root/local/bin/etcdctl \
  --endpoints=${ETCD_ENDPOINTS} \
  --ca-file=/etc/kubernetes/ssl/ca.pem \
  --cert-file=/etc/flanneld/ssl/flanneld.pem \
  --key-file=/etc/flanneld/ssl/flanneld-key.pem \
  ls ${FLANNEL_ETCD_PREFIX}/subnets
2017-07-05 17:27:46.007743 I | warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
/kubernetes/network/subnets/172.30.43.0-24
/kubernetes/network/subnets/172.30.44.0-24
/kubernetes/network/subnets/172.30.45.0-24

$ # view the IP and network parameters of the flanneld serving a given Pod subnet
$ /root/local/bin/etcdctl \
  --endpoints=${ETCD_ENDPOINTS} \
  --ca-file=/etc/kubernetes/ssl/ca.pem \
  --cert-file=/etc/flanneld/ssl/flanneld.pem \
  --key-file=/etc/flanneld/ssl/flanneld-key.pem \
  get ${FLANNEL_ETCD_PREFIX}/subnets/172.30.43.0-24
2017-07-05 17:28:34.116874 I | warning: ignoring ServerName for user-provided CA for backwards compatibility is deprecated
{"PublicIP":"192.168.1.207","BackendType":"vxlan","BackendData":{"VtepMAC":"52:73:8c:2f:ae:3c"}}

Making sure Pod subnets on different nodes can reach each other

After Flannel has been deployed on all nodes, list the Pod subnets that have been allocated (/24):

$ /root/local/bin/etcdctl \
  --endpoints=${ETCD_ENDPOINTS} \
  --ca-file=/etc/kubernetes/ssl/ca.pem \
  --cert-file=/etc/flanneld/ssl/flanneld.pem \
  --key-file=/etc/flanneld/ssl/flanneld-key.pem \
  ls ${FLANNEL_ETCD_PREFIX}/subnets
/kubernetes/network/subnets/172.30.43.0-24
/kubernetes/network/subnets/172.30.44.0-24
/kubernetes/network/subnets/172.30.45.0-24

The Pod subnets currently assigned to the three nodes are 172.30.43.0-24, 172.30.44.0-24, and 172.30.45.0-24.




06 - Deploying the Master Node

Deploying the master node

The kubernetes master node runs the following components:

  • kube-apiserver
  • kube-scheduler
  • kube-controller-manager

For now these three components need to be deployed on the same machine:

  • kube-scheduler, kube-controller-manager, and kube-apiserver are closely related in function;
  • only one kube-scheduler and one kube-controller-manager process may be active at a time; running more than one requires electing a leader;

This document describes deploying a single-machine kubernetes master; it does not build a highly available master cluster.

A follow-up on deploying a load balancer is planned, so that clients (kubectl, kubelet, kube-proxy) use the LB VIP to reach kube-apiserver, giving a highly available master cluster.

The master node communicates with the Pods on the node machines over the Pod network, so the Flannel network must also be deployed on the master node.

Variables used

The variables used in this document are defined as follows:

$ export MASTER_IP=192.168.1.206 # replace with the IP of the master machine being deployed
$ # import the other global variables used: SERVICE_CIDR, CLUSTER_CIDR, NODE_PORT_RANGE, ETCD_ENDPOINTS, BOOTSTRAP_TOKEN
$ source /root/local/bin/environment.sh
$

Downloading the latest binaries

There are two ways to download them:

  1. Download the release tarball from the github release page, unpack it, and run the download script:

     $ wget https://github.com/kubernetes/kubernetes/releases/download/v1.6.2/kubernetes.tar.gz
     $ tar -xzvf kubernetes.tar.gz
     $ cd kubernetes
     $ ./cluster/get-kube-binaries.sh

  2. Download the client or server tarball from the CHANGELOG page. The server tarball kubernetes-server-linux-amd64.tar.gz already contains the client (kubectl) binary, so kubernetes-client-linux-amd64.tar.gz does not need to be downloaded separately:

     $ # wget https://dl.k8s.io/v1.6.2/kubernetes-client-linux-amd64.tar.gz
     $ wget https://dl.k8s.io/v1.6.2/kubernetes-server-linux-amd64.tar.gz
     $ tar -xzvf kubernetes-server-linux-amd64.tar.gz
     $ cd kubernetes
     $ tar -xzvf kubernetes-src.tar.gz

Copy the binaries to the target path:

$ sudo cp -r server/bin/{kube-apiserver,kube-controller-manager,kube-scheduler,kubectl,kube-proxy,kubelet} /root/local/bin/
$

Installing and configuring flanneld

See 05 - Deploying the Flannel Network.

Creating the kubernetes certificate

Create the kubernetes certificate signing request:


$ cat > kubernetes-csr.json <<EOF
{
  "CN": "kubernetes",
  "hosts": [
    "127.0.0.1",
    "${MASTER_IP}",
    "${CLUSTER_KUBERNETES_SVC_IP}",
    "kubernetes",
    "kubernetes.default",
    "kubernetes.default.svc",
    "kubernetes.default.svc.cluster",
    "kubernetes.default.svc.cluster.local"
  ],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "BeiJing",
      "L": "BeiJing",
      "O": "k8s",
      "OU": "System"
    }
  ]
}
EOF


  • if the hosts field is non-empty it must list the IPs or domain names authorized to use the certificate, so the IP of the master node being deployed is listed above;
  • the Service Cluster IP of the service named kubernetes that kube-apiserver registers must also be added; it is normally the first IP of the network given by the kube-apiserver --service-cluster-ip-range option, e.g. "10.254.0.1":

    $ kubectl get svc kubernetes
    NAME         CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
    kubernetes   10.254.0.1   <none>        443/TCP   1d

Generate the kubernetes certificate and private key:

$ cfssl gencert -ca=/etc/kubernetes/ssl/ca.pem \
  -ca-key=/etc/kubernetes/ssl/ca-key.pem \
  -config=/etc/kubernetes/ssl/ca-config.json \
  -profile=kubernetes kubernetes-csr.json | cfssljson -bare kubernetes
$ ls kubernetes*
kubernetes.csr  kubernetes-csr.json  kubernetes-key.pem  kubernetes.pem
$ sudo mkdir -p /etc/kubernetes/ssl/
$ sudo mv kubernetes*.pem /etc/kubernetes/ssl/
$ rm kubernetes.csr kubernetes-csr.json

Configuring and starting kube-apiserver

Create the client token file used by kube-apiserver

When kubelet starts for the first time it sends a TLS Bootstrapping request to kube-apiserver, which checks whether the token in the request matches its configured token.csv; if it matches, kube-apiserver automatically generates a certificate and key for that kubelet.

(This token only needs to be created once, on the master.)


$ # the imported environment.sh defines the BOOTSTRAP_TOKEN variable
$ cat > token.csv <<EOF
${BOOTSTRAP_TOKEN},kubelet-bootstrap,10001,"system:kubelet-bootstrap"
EOF
$ mv token.csv /etc/kubernetes/
$

Create the kube-apiserver systemd unit file

$ cat > kube-apiserver.service <<EOF
[Unit]
Description=Kubernetes API Server
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
After=network.target

[Service]
ExecStart=/root/local/bin/kube-apiserver \\
  --admission-control=NamespaceLifecycle,LimitRanger,ServiceAccount,DefaultStorageClass,ResourceQuota \\
  --advertise-address=${MASTER_IP} \\
  --bind-address=${MASTER_IP} \\
  --insecure-bind-address=${MASTER_IP} \\
  --authorization-mode=RBAC \\
  --runtime-config=rbac.authorization.k8s.io/v1alpha1 \\
  --kubelet-https=true \\
  --experimental-bootstrap-token-auth \\
  --token-auth-file=/etc/kubernetes/token.csv \\
  --service-cluster-ip-range=${SERVICE_CIDR} \\
  --service-node-port-range=${NODE_PORT_RANGE} \\
  --tls-cert-file=/etc/kubernetes/ssl/kubernetes.pem \\
  --tls-private-key-file=/etc/kubernetes/ssl/kubernetes-key.pem \\
  --client-ca-file=/etc/kubernetes/ssl/ca.pem \\
  --service-account-key-file=/etc/kubernetes/ssl/ca-key.pem \\
  --etcd-cafile=/etc/kubernetes/ssl/ca.pem \\
  --etcd-certfile=/etc/kubernetes/ssl/kubernetes.pem \\
  --etcd-keyfile=/etc/kubernetes/ssl/kubernetes-key.pem \\
  --etcd-servers=${ETCD_ENDPOINTS} \\
  --enable-swagger-ui=true \\
  --allow-privileged=true \\
  --apiserver-count=3 \\
  --audit-log-maxage=30 \\
  --audit-log-maxbackup=3 \\
  --audit-log-maxsize=100 \\
  --audit-log-path=/var/lib/audit.log \\
  --event-ttl=1h \\
  --v=2
Restart=on-failure
RestartSec=5
Type=notify
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOF


  • starting with version 1.6, kube-apiserver uses the etcd v3 API and storage format;
  • --authorization-mode=RBAC enables RBAC authorization on the secure port and rejects unauthorized requests;
  • kube-scheduler and kube-controller-manager usually run on the same machine as kube-apiserver and talk to it over the insecure port;
  • kubelet, kube-proxy, and kubectl run on other Node machines; when they access kube-apiserver over the secure port they must first authenticate with TLS certificates and are then authorized by RBAC;
  • kube-proxy and kubectl obtain RBAC authorization through the User and Group specified in their certificates;
  • if kubelet TLS Bootstrapping is used, the --kubelet-certificate-authority, --kubelet-client-certificate, and --kubelet-client-key options must not be set, otherwise kube-apiserver later fails to verify the kubelet certificates with an "x509: certificate signed by unknown authority" error;
  • --admission-control must include ServiceAccount, otherwise deploying the cluster add-ons will fail;
  • --bind-address must not be 127.0.0.1;
  • --service-cluster-ip-range specifies the Service Cluster IP range, which must not be routable;
  • --service-node-port-range=${NODE_PORT_RANGE} specifies the NodePort port range;
  • by default kubernetes objects are stored under the /registry path in etcd; this can be changed with the --etcd-prefix option;


The complete unit file: kube-apiserver.service

Starting kube-apiserver

$ sudo cp kube-apiserver.service /etc/systemd/system/
$ sudo systemctl daemon-reload
$ sudo systemctl enable kube-apiserver
$ sudo systemctl start kube-apiserver
$ sudo systemctl status kube-apiserver
$
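A quick sanity check of my own (not part of the original walkthrough): the insecure port should answer health probes once the service is up.

$ curl http://${MASTER_IP}:8080/healthz
ok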

Configuring and starting kube-controller-manager

Create the kube-controller-manager systemd unit file

$ cat > kube-controller-manager.service <<EOF
[Unit]
Description=Kubernetes Controller Manager
Documentation=https://github.com/GoogleCloudPlatform/kubernetes

[Service]
ExecStart=/root/local/bin/kube-controller-manager \\
  --address=127.0.0.1 \\
  --master=http://${MASTER_IP}:8080 \\
  --allocate-node-cidrs=true \\
  --service-cluster-ip-range=${SERVICE_CIDR} \\
  --cluster-cidr=${CLUSTER_CIDR} \\
  --cluster-name=kubernetes \\
  --cluster-signing-cert-file=/etc/kubernetes/ssl/ca.pem \\
  --cluster-signing-key-file=/etc/kubernetes/ssl/ca-key.pem \\
  --service-account-private-key-file=/etc/kubernetes/ssl/ca-key.pem \\
  --root-ca-file=/etc/kubernetes/ssl/ca.pem \\
  --leader-elect=true \\
  --v=2
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF


  • --address must be 127.0.0.1, because the current kube-apiserver expects scheduler and controller-manager to run on the same machine; otherwise:

    $ kubectl get componentstatuses
    NAME                 STATUS      MESSAGE                                                                                        ERROR
    controller-manager   Unhealthy   Get http://127.0.0.1:10252/healthz: dial tcp 127.0.0.1:10252: getsockopt: connection refused
    scheduler            Unhealthy   Get http://127.0.0.1:10251/healthz: dial tcp 127.0.0.1:10251: getsockopt: connection refused

    See: https://github.com/kubernetes-incubator/bootkube/issues/64

  • --master=http://{MASTER_IP}:8080: talk to kube-apiserver over the insecure 8080 port;
  • --cluster-cidr specifies the CIDR range of the cluster's Pods; it must be routable between the Nodes (guaranteed by flanneld);
  • --service-cluster-ip-range specifies the CIDR range of the cluster's Services; it must not be routable between the Nodes and must match the corresponding kube-apiserver parameter;
  • the certificate and key given by --cluster-signing-* are used to sign the certificates and keys created for TLS Bootstrapping;
  • --root-ca-file is used to verify the kube-apiserver certificate; only when this parameter is set is the CA certificate placed into the ServiceAccount of Pod containers;
  • --leader-elect=true elects a single working kube-controller-manager process when the master consists of multiple machines;


The complete unit file: kube-controller-manager.service

Starting kube-controller-manager

$ sudo cp kube-controller-manager.service /etc/systemd/system/
$ sudo systemctl daemon-reload
$ sudo systemctl enable kube-controller-manager
$ sudo systemctl start kube-controller-manager
$

Configuring and starting kube-scheduler

Create the kube-scheduler systemd unit file

$ cat > kube-scheduler.service <<EOF
[Unit]
Description=Kubernetes Scheduler
Documentation=https://github.com/GoogleCloudPlatform/kubernetes

[Service]
ExecStart=/root/local/bin/kube-scheduler \\
  --address=127.0.0.1 \\
  --master=http://${MASTER_IP}:8080 \\
  --leader-elect=true \\
  --v=2
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF


  • --address must be 127.0.0.1, because the current kube-apiserver expects scheduler and controller-manager to run on the same machine;
  • --master=http://{MASTER_IP}:8080: talk to kube-apiserver over the insecure 8080 port;
  • --leader-elect=true elects a single working kube-scheduler process when the master consists of multiple machines;


The complete unit file: kube-scheduler.service

Starting kube-scheduler

$ sudo cp kube-scheduler.service /etc/systemd/system/
$ sudo systemctl daemon-reload
$ sudo systemctl enable kube-scheduler
$ sudo systemctl start kube-scheduler
$
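Both components expose a local healthz endpoint on the ports mentioned in the notes above; a quick check of my own, not from the original guide:

$ curl http://127.0.0.1:10251/healthz   # kube-scheduler
ok
$ curl http://127.0.0.1:10252/healthz   # kube-controller-manager
ok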

Verifying the master node

$ kubectl get componentstatuses
NAME                 STATUS    MESSAGE              ERROR
controller-manager   Healthy   ok
scheduler            Healthy   ok
etcd-0               Healthy   {"health": "true"}
etcd-1               Healthy   {"health": "true"}
etcd-2               Healthy   {"health": "true"}




07 - Deploying the Node Machines

Deploying the Node machines

A kubernetes Node runs the following components:

  • flanneld
  • docker
  • kubelet
  • kube-proxy

Variables used

The variables used in this document are defined as follows:

$ # replace with the IP of any kubernetes master machine
$ export MASTER_IP=192.168.1.206
$ export KUBE_APISERVER="https://${MASTER_IP}:6443"
$ # IP of the node being deployed
$ export NODE_IP=192.168.1.206
$ # import the other global variables used: ETCD_ENDPOINTS, FLANNEL_ETCD_PREFIX, CLUSTER_CIDR, CLUSTER_DNS_SVC_IP, CLUSTER_DNS_DOMAIN, SERVICE_CIDR
$ source /root/local/bin/environment.sh
$

Installing and configuring flanneld

See 05 - Deploying the Flannel Network.

Installing and configuring docker

Download the latest docker binaries

$ wget https://get.docker.com/builds/Linux/x86_64/docker-17.04.0-ce.tgz
$ tar -xvf docker-17.04.0-ce.tgz
$ cp docker/docker* /root/local/bin
$ cp docker/completion/bash/docker /etc/bash_completion.d/
$

Create the docker systemd unit file

$ cat docker.service
[Unit]
Description=Docker Application Container Engine
Documentation=http://docs.docker.io

[Service]
Environment="PATH=/root/local/bin:/bin:/sbin:/usr/bin:/usr/sbin"
EnvironmentFile=-/run/flannel/docker
ExecStart=/root/local/bin/dockerd --log-level=error $DOCKER_NETWORK_OPTIONS
ExecReload=/bin/kill -s HUP $MAINPID
Restart=on-failure
RestartSec=5
LimitNOFILE=infinity
LimitNPROC=infinity
LimitCORE=infinity
Delegate=yes
KillMode=process

[Install]
WantedBy=multi-user.target

  • dockerd invokes other docker commands at runtime, such as docker-proxy, so the directory that contains the docker commands must be on PATH;
  • when flanneld starts it writes the network configuration into the DOCKER_NETWORK_OPTIONS variable in the /run/flannel/docker file; the dockerd command line uses this variable to configure the docker0 bridge;
  • if several EnvironmentFile options are specified, /run/flannel/docker must come last (so that docker0 uses the bip parameter generated by flanneld);
  • the --iptables and --ip-masq options, which are enabled by default, must not be disabled;
  • on newer kernels the overlay storage driver is recommended;
  • starting with version 1.13, docker may set the default policy of the iptables FORWARD chain to DROP, which causes pings to Pod IPs on other Nodes to fail; in that case set the policy back to ACCEPT manually:

    $ sudo iptables -P FORWARD ACCEPT
    $

  • to speed up image pulls, a domestic registry mirror can be used and the download concurrency increased (if dockerd is already running, restart it for this to take effect):

    $ cat /etc/docker/daemon.json
    {
      "registry-mirrors": ["https://docker.mirrors.ustc.edu.cn", "hub-mirror.c.163.com"],
      "max-concurrent-downloads": 10
    }


The complete unit file: docker.service

Starting dockerd

$ sudo cp docker.service /etc/systemd/system/docker.service
$ sudo systemctl daemon-reload
$ sudo systemctl stop firewalld
$ sudo systemctl disable firewalld
$ sudo iptables -F && sudo iptables -X && sudo iptables -F -t nat && sudo iptables -X -t nat
$ sudo systemctl enable docker
$ sudo systemctl start docker
$

  • firewalld must be disabled, otherwise it may duplicate the iptables rules docker creates;
  • it is best to clean out old iptables rules and chains first;

Checking the docker service


$ docker version

$

Installing and configuring kubelet

When kubelet starts it sends a TLS Bootstrapping request to kube-apiserver. The kubelet-bootstrap user from the bootstrap token file must first be bound to the system:node-bootstrapper role so that kubelet has permission to create certificate signing requests (certificatesigningrequests):

(This only needs to be run once, on the first node.)

$ kubectl create clusterrolebinding kubelet-bootstrap --clusterrole=system:node-bootstrapper --user=kubelet-bootstrap
$

  • --user=kubelet-bootstrap is the user name specified in /etc/kubernetes/token.csv, and is also written into /etc/kubernetes/bootstrap.kubeconfig;

Download the latest kubelet and kube-proxy binaries

$ wget https://dl.k8s.io/v1.6.2/kubernetes-server-linux-amd64.tar.gz
$ tar -xzvf kubernetes-server-linux-amd64.tar.gz
$ cd kubernetes
$ tar -xzvf kubernetes-src.tar.gz
$ sudo cp -r ./server/bin/{kube-proxy,kubelet} /root/local/bin/
$

Create the kubelet bootstrapping kubeconfig file

$ # set cluster parameters
$ kubectl config set-cluster kubernetes \
  --certificate-authority=/etc/kubernetes/ssl/ca.pem \
  --embed-certs=true \
  --server=${KUBE_APISERVER} \
  --kubeconfig=bootstrap.kubeconfig
$ # set client authentication parameters
$ kubectl config set-credentials kubelet-bootstrap \
  --token=${BOOTSTRAP_TOKEN} \
  --kubeconfig=bootstrap.kubeconfig
$ # set context parameters
$ kubectl config set-context default \
  --cluster=kubernetes \
  --user=kubelet-bootstrap \
  --kubeconfig=bootstrap.kubeconfig
$ # set the default context
$ kubectl config use-context default --kubeconfig=bootstrap.kubeconfig
$ mv bootstrap.kubeconfig /etc/kubernetes/

  • when --embed-certs is true, the certificate-authority certificate is written into the generated bootstrap.kubeconfig file;
  • no key or certificate is specified in the kubelet client authentication parameters; they are generated later by kube-apiserver;

Create the kubelet systemd unit file

$ sudo mkdir /var/lib/kubelet  # the working directory must be created first
$ cat > kubelet.service <<EOF
[Unit]
Description=Kubernetes Kubelet
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
After=docker.service
Requires=docker.service

[Service]
WorkingDirectory=/var/lib/kubelet
ExecStart=/root/local/bin/kubelet \\
  --address=${NODE_IP} \\
  --hostname-override=${NODE_IP} \\
  --pod-infra-container-image=registry.access.redhat.com/rhel7/pod-infrastructure:latest \\
  --experimental-bootstrap-kubeconfig=/etc/kubernetes/bootstrap.kubeconfig \\
  --kubeconfig=/etc/kubernetes/kubelet.kubeconfig \\
  --require-kubeconfig \\
  --cert-dir=/etc/kubernetes/ssl \\
  --cluster-dns=${CLUSTER_DNS_SVC_IP} \\
  --cluster-domain=${CLUSTER_DNS_DOMAIN} \\
  --hairpin-mode promiscuous-bridge \\
  --allow-privileged=true \\
  --serialize-image-pulls=false \\
  --logtostderr=true \\
  --v=2
ExecStopPost=/sbin/iptables -A INPUT -s 10.0.0.0/8 -p tcp --dport 4194 -j ACCEPT
ExecStopPost=/sbin/iptables -A INPUT -s 172.16.0.0/12 -p tcp --dport 4194 -j ACCEPT
ExecStopPost=/sbin/iptables -A INPUT -s 192.168.0.0/16 -p tcp --dport 4194 -j ACCEPT
ExecStopPost=/sbin/iptables -A INPUT -p tcp --dport 4194 -j DROP
Restart=on-failure
RestartSec=5

[Install]
WantedBy=multi-user.target
EOF

  • --address must not be set to 127.0.0.1, otherwise later accesses from Pods to the kubelet API will fail, because 127.0.0.1 inside a Pod points to the Pod itself rather than to the kubelet;
  • if --hostname-override is set, kube-proxy must set the same option, otherwise the Node will not be found;
  • --experimental-bootstrap-kubeconfig points to the bootstrap kubeconfig file; kubelet uses the user name and token in it to send the TLS Bootstrapping request to kube-apiserver;
  • after the administrator approves the CSR, kubelet automatically creates the certificate and key (kubelet-client.crt and kubelet-client.key) in the --cert-dir directory and writes the file given by --kubeconfig (creating it automatically);
  • it is recommended to keep the kube-apiserver address in the --kubeconfig file; if --api-servers is not given, --require-kubeconfig must be set so that the kube-apiserver address is read from the configuration file, otherwise kubelet cannot find the API server after starting (the logs report that no API Server was found) and kubectl get nodes does not show the Node;
  • --cluster-dns is the kubedns Service IP (it can be allocated in advance and given to the kubedns service when it is created later), and --cluster-domain is the domain suffix; both parameters must be set for them to take effect;
  • by default the kubelet cAdvisor listens on port 4194 on all interfaces, which is unsafe for machines with a public interface; the iptables rules set by the ExecStopPost options only allow internal machines to access port 4194;


The complete unit file: kubelet.service

Starting kubelet

$ sudo cp kubelet.service /etc/systemd/system/kubelet.service
$ sudo systemctl daemon-reload
$ sudo systemctl enable kubelet
$ sudo systemctl start kubelet
$ systemctl status kubelet
$

Approving the kubelet TLS certificate request

When kubelet starts for the first time it sends a certificate signing request to kube-apiserver; kubernetes only adds the Node to the cluster after the request has been approved.

List the pending CSR requests:


$ kubectl get csr

NAME        AGE       REQUESTOR           CONDITION

csr-2b308   4m        kubelet-bootstrap   Pending

$ kubectl get nodes

No resources found.

Approve the CSR request:

$ kubectl certificate approve csr-2b308
certificatesigningrequest "csr-2b308" approved
$ kubectl get nodes
NAME        STATUS    AGE      VERSION
10.64.3.7   Ready     49m      v1.6.2
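When several nodes bootstrap at the same time, each one creates its own CSR. A hedged convenience one-liner for approving everything that is still pending (review the list first before approving blindly):

$ kubectl certificate approve $(kubectl get csr | grep Pending | awk '{print $1}')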

The kubelet kubeconfig file and key pair are generated automatically:

$ ls -l /etc/kubernetes/kubelet.kubeconfig
-rw------- 1 root root 2284 Apr  7 02:07 /etc/kubernetes/kubelet.kubeconfig
$ ls -l /etc/kubernetes/ssl/kubelet*
-rw-r--r-- 1 root root 1046 Apr  7 02:07 /etc/kubernetes/ssl/kubelet-client.crt
-rw------- 1 root root  227 Apr  7 02:04 /etc/kubernetes/ssl/kubelet-client.key
-rw-r--r-- 1 root root 1103 Apr  7 02:07 /etc/kubernetes/ssl/kubelet.crt
-rw------- 1 root root 1675 Apr  7 02:07 /etc/kubernetes/ssl/kubelet.key

Configuring kube-proxy

Create the kube-proxy certificate

Create the kube-proxy certificate signing request:

$ cat kube-proxy-csr.json
{
  "CN": "system:kube-proxy",
  "hosts": [],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "BeiJing",
      "L": "BeiJing",
      "O": "k8s",
      "OU": "System"
    }
  ]
}

  • CN sets the certificate's User to system:kube-proxy;
  • the kube-apiserver predefined RoleBinding system:node-proxier binds the User system:kube-proxy to the Role system:node-proxier, which grants permission to call the kube-apiserver Proxy-related APIs;
  • the hosts attribute is an empty list;

Generate the kube-proxy client certificate and private key:

$ cfssl gencert -ca=/etc/kubernetes/ssl/ca.pem \
  -ca-key=/etc/kubernetes/ssl/ca-key.pem \
  -config=/etc/kubernetes/ssl/ca-config.json \
  -profile=kubernetes kube-proxy-csr.json | cfssljson -bare kube-proxy
$ ls kube-proxy*
kube-proxy.csr  kube-proxy-csr.json  kube-proxy-key.pem  kube-proxy.pem
$ sudo mv kube-proxy*.pem /etc/kubernetes/ssl/
$ rm kube-proxy.csr kube-proxy-csr.json
$

Create the kube-proxy kubeconfig file

$ # set cluster parameters
$ kubectl config set-cluster kubernetes \
  --certificate-authority=/etc/kubernetes/ssl/ca.pem \
  --embed-certs=true \
  --server=${KUBE_APISERVER} \
  --kubeconfig=kube-proxy.kubeconfig
$ # set client authentication parameters
$ kubectl config set-credentials kube-proxy \
  --client-certificate=/etc/kubernetes/ssl/kube-proxy.pem \
  --client-key=/etc/kubernetes/ssl/kube-proxy-key.pem \
  --embed-certs=true \
  --kubeconfig=kube-proxy.kubeconfig
$ # set context parameters
$ kubectl config set-context default \
  --cluster=kubernetes \
  --user=kube-proxy \
  --kubeconfig=kube-proxy.kubeconfig
$ # set the default context
$ kubectl config use-context default --kubeconfig=kube-proxy.kubeconfig
$ mv kube-proxy.kubeconfig /etc/kubernetes/

  • --embed-certs is true for both the cluster and the client authentication parameters, so the contents of the certificate files referenced by certificate-authority, client-certificate, and client-key are written into the generated kube-proxy.kubeconfig file;
  • the CN of kube-proxy.pem is system:kube-proxy; the kube-apiserver predefined RoleBinding system:node-proxier binds the User system:kube-proxy to the Role system:node-proxier, which grants permission to call the kube-apiserver Proxy-related APIs;

Create the kube-proxy systemd unit file

$ sudo mkdir -p /var/lib/kube-proxy  # the working directory must be created first
$ cat > kube-proxy.service <<EOF
[Unit]
Description=Kubernetes Kube-Proxy Server
Documentation=https://github.com/GoogleCloudPlatform/kubernetes
After=network.target

[Service]
WorkingDirectory=/var/lib/kube-proxy
ExecStart=/root/local/bin/kube-proxy \\
  --bind-address=${NODE_IP} \\
  --hostname-override=${NODE_IP} \\
  --cluster-cidr=${SERVICE_CIDR} \\
  --kubeconfig=/etc/kubernetes/kube-proxy.kubeconfig \\
  --logtostderr=true \\
  --v=2
Restart=on-failure
RestartSec=5
LimitNOFILE=65536

[Install]
WantedBy=multi-user.target
EOF

  • --hostname-override must match the kubelet value, otherwise kube-proxy will not find the Node after starting and will not create any iptables rules;
  • --cluster-cidr must match the kube-apiserver --service-cluster-ip-range option;
  • kube-proxy uses --cluster-cidr to distinguish traffic inside and outside the cluster; only when --cluster-cidr or --masquerade-all is set does kube-proxy SNAT requests to Service IPs;
  • the configuration file given by --kubeconfig embeds the kube-apiserver address, user name, certificates, keys, and other request and authentication information;
  • the predefined RoleBinding system:node-proxier binds the User system:kube-proxy to the Role system:node-proxier, which grants permission to call the kube-apiserver Proxy-related APIs;


The complete unit file: kube-proxy.service

Starting kube-proxy

$ sudo cp kube-proxy.service /etc/systemd/system/
$ sudo systemctl daemon-reload
$ sudo systemctl enable kube-proxy
$ sudo systemctl start kube-proxy
$ systemctl status kube-proxy
$

Verifying cluster functionality

Definition file:

$ cat nginx-ds.yml
apiVersion: v1
kind: Service
metadata:
  name: nginx-ds
  labels:
    app: nginx-ds
spec:
  type: NodePort
  selector:
    app: nginx-ds
  ports:
  - name: http
    port: 80
    targetPort: 80
---
apiVersion: extensions/v1beta1
kind: DaemonSet
metadata:
  name: nginx-ds
  labels:
    addonmanager.kubernetes.io/mode: Reconcile
spec:
  template:
    metadata:
      labels:
        app: nginx-ds
    spec:
      containers:
      - name: my-nginx
        image: nginx:1.7.9
        ports:
        - containerPort: 80
Create the Pod and Service:

$ kubectl create -f nginx-ds.yml
service "nginx-ds" created
daemonset "nginx-ds" created

Check the node status


$ kubectl get nodes

NAME            STATUS    AGE      VERSION

192.168.1.206   Ready    1d        v1.6.2

192.168.1.207   Ready    1d        v1.6.2

192.168.1.208   Ready    1d        v1.6.2

Everything is normal when all nodes are Ready.

Check Pod IP connectivity across Nodes

$ kubectl get pods -o wide | grep nginx-ds
nginx-ds-6ktz8              1/1       Running            0          5m        172.30.43.19   192.168.1.206
nginx-ds-6ktz9              1/1       Running            0          5m        172.30.44.20   192.168.1.207

As shown, the nginx-ds Pod IPs are 172.30.43.19 and 172.30.44.20; ping these two IPs from every Node to check connectivity.

Check Service IP and port reachability

$ kubectl get svc | grep nginx-ds
nginx-ds     10.254.136.178   <nodes>       80:8744/TCP         11m

As shown:

  • Service IP: 10.254.136.178
  • Service port: 80
  • NodePort: 8744

Run on every Node:

$ curl 10.254.136.178   # the Service IP from the `kubectl get svc | grep nginx-ds` output
$

The nginx welcome page is expected as output.

Check NodePort reachability of the Service

Run on every Node:

$ export NODE_IP=192.168.1.207   # IP of the current Node
$ export NODE_PORT=8744          # the NodePort mapped to port 80 in the `kubectl get svc | grep nginx-ds` output
$ curl ${NODE_IP}:${NODE_PORT}
$

The nginx welcome page is expected as output.




08 - Deploying the DNS Add-on

Deploying the kubedns add-on

Official files directory: kubernetes/cluster/addons/dns

Files used:

$ ls *.yaml *.base
kubedns-cm.yaml  kubedns-sa.yaml  kubedns-controller.yaml.base  kubedns-svc.yaml.base


The modified yaml files can be found in: dns



Predefined system RoleBinding

The predefined RoleBinding system:kube-dns binds the kube-dns ServiceAccount in the kube-system namespace to the system:kube-dns Role, which grants access to the kube-apiserver DNS-related APIs:

$ kubectl get clusterrolebindings system:kube-dns -o yaml
apiVersion: rbac.authorization.k8s.io/v1beta1
kind: ClusterRoleBinding
metadata:
  annotations:
    rbac.authorization.kubernetes.io/autoupdate: "true"
  creationTimestamp: 2017-04-06T17:40:47Z
  labels:
    kubernetes.io/bootstrapping: rbac-defaults
  name: system:kube-dns
  resourceVersion: "56"
  selfLink: /apis/rbac.authorization.k8s.io/v1beta1/clusterrolebindingssystem%3Akube-dns
  uid: 2b55cdbe-1af0-11e7-af35-8cdcd4b3be48
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: system:kube-dns
subjects:
- kind: ServiceAccount
  name: kube-dns
  namespace: kube-system


The Pods defined in kubedns-controller.yaml use the kube-dns ServiceAccount defined in the kubedns-sa.yaml file, so they have permission to access the kube-apiserver DNS-related APIs.

Configuring the kube-dns ServiceAccount

No changes needed.

Configuring the kube-dns Service

$ diff kubedns-svc.yaml.base kubedns-svc.yaml
30c30
<   clusterIP: __PILLAR__DNS__SERVER__
---
>   clusterIP: 10.254.0.2


  • spec.clusterIP must be set to the value of the CLUSTER_DNS_SVC_IP variable in the cluster environment variables; this IP must match the kubelet --cluster-dns parameter;

Configuring the kube-dns Deployment


$ diff kubedns-controller.yaml.base kubedns-controller.yaml
58c58
< image: gcr.io/google_containers/k8s-dns-kube-dns-amd64:1.14.1
---
> image: xuejipeng/k8s-dns-kube-dns-amd64:v1.14.1
88c88
< - --domain=__PILLAR__DNS__DOMAIN__.
---
> - --domain=cluster.local.
92c92
< __PILLAR__FEDERATIONS__DOMAIN__MAP__
---
> #__PILLAR__FEDERATIONS__DOMAIN__MAP__
110c110
< image: gcr.io/google_containers/k8s-dns-dnsmasq-nanny-amd64:1.14.1
---
> image: xuejipeng/k8s-dns-dnsmasq-nanny-amd64:v1.14.1
129c129
< - --server=/__PILLAR__DNS__DOMAIN__/127.0.0.1#10053
---
> - --server=/cluster.local./127.0.0.1#10053
148c148
< image: gcr.io/google_containers/k8s-dns-sidecar-amd64:1.14.1
---
> image: xuejipeng/k8s-dns-sidecar-amd64:v1.14.1
161,162c161,162
< - --probe=kubedns,127.0.0.1:10053,kubernetes.default.svc.__PILLAR__DNS__DOMAIN__,5,A
< - --probe=dnsmasq,127.0.0.1:53,kubernetes.default.svc.__PILLAR__DNS__DOMAIN__,5,A
---
> - --probe=kubedns,127.0.0.1:10053,kubernetes.default.svc.cluster.local.,5,A
> - --probe=dnsmasq,127.0.0.1:53,kubernetes.default.svc.cluster.local.,5,A


  • --domain is the value of the CLUSTER_DNS_DOMAIN variable from the cluster environment document;
  • the kube-dns ServiceAccount, for which the system already provides a RoleBinding, is used; this account has permission to access the kube-apiserver DNS-related APIs;

Applying all definition files


$ pwd
/root/kubernetes-git/cluster/addons/dns
$ ls *.yaml
kubedns-cm.yaml kubedns-controller.yaml kubedns-sa.yaml kubedns-svc.yaml
$ kubectl create -f .
$

Checking kubedns functionality

Create a new Deployment


$ cat my-nginx.yaml
apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: my-nginx
spec:
  replicas: 2
  template:
    metadata:
      labels:
        run: my-nginx
    spec:
      containers:
      - name: my-nginx
        image: nginx:1.7.9
        ports:
        - containerPort: 80
$ kubectl create -f my-nginx.yaml
$


Expose the Deployment to create the my-nginx Service:

$ kubectl expose deploy my-nginx
$ kubectl get services --all-namespaces | grep my-nginx
default       my-nginx               10.254.86.48   <none>        80/TCP          1d


Create another Pod and check whether its /etc/resolv.conf contains the --cluster-dns and --cluster-domain configured for kubelet, and whether the my-nginx service resolves to the ClusterIP 10.254.86.48 shown above:

$ cat pod-nginx.yaml
apiVersion: v1
kind: Pod
metadata:
  name: nginx
spec:
  containers:
  - name: nginx
    image: nginx:1.7.9
    ports:
    - containerPort: 80
$ kubectl create -f pod-nginx.yaml
$ kubectl exec nginx -i -t -- /bin/bash
root@nginx:/# cat /etc/resolv.conf
nameserver 10.254.0.2
search default.svc.cluster.local svc.cluster.local cluster.local tjwq01.ksyun.com
options ndots:5

root@nginx:/# ping my-nginx
PING my-nginx.default.svc.cluster.local (10.254.86.48): 48 data bytes
^C--- my-nginx.default.svc.cluster.local ping statistics ---
2 packets transmitted, 0 packets received, 100% packet loss

root@nginx:/# ping kubernetes
PING kubernetes.default.svc.cluster.local (10.254.0.1): 48 data bytes
^C--- kubernetes.default.svc.cluster.local ping statistics ---
1 packets transmitted, 0 packets received, 100% packet loss

root@nginx:/# ping kube-dns.kube-system.svc.cluster.local
PING kube-dns.kube-system.svc.cluster.local (10.254.0.2): 48 data bytes
^C--- kube-dns.kube-system.svc.cluster.local ping statistics ---
1 packets transmitted, 0 packets received, 100% packet loss
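Note that the 100% packet loss above is expected: Service IPs are virtual and kube-proxy's iptables rules do not forward ICMP, so what matters is that the names resolve to the correct ClusterIPs. A hedged extra check from inside the pod, assuming a lookup tool is available in the image:

root@nginx:/# nslookup my-nginx       # should resolve to 10.254.86.48 (if nslookup is installed)
root@nginx:/# getent hosts kubernetes # glibc-based fallback; should resolve to 10.254.0.1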


Attachments:

kubedns-cm.yaml

apiVersion: v1
kind: ConfigMap
metadata:
  name: kube-dns
  namespace: kube-system
  labels:
    addonmanager.kubernetes.io/mode: EnsureExists

kubedns-controller.yaml

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: kube-dns
  namespace: kube-system
  labels:
    k8s-app: kube-dns
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
spec:
  # replicas: not specified here:
  # 1. In order to make Addon Manager do not reconcile this replicas parameter.
  # 2. Default is 1.
  # 3. Will be tuned in real time if DNS horizontal auto-scaling is turned on.
  strategy:
    rollingUpdate:
      maxSurge: 10%
      maxUnavailable: 0
  selector:
    matchLabels:
      k8s-app: kube-dns
  template:
    metadata:
      labels:
        k8s-app: kube-dns
      annotations:
        scheduler.alpha.kubernetes.io/critical-pod: ''
    spec:
      tolerations:
      - key: "CriticalAddonsOnly"
        operator: "Exists"
      volumes:
      - name: kube-dns-config
        configMap:
          name: kube-dns
          optional: true
      containers:
      - name: kubedns
        image: xuejipeng/k8s-dns-kube-dns-amd64:v1.14.1
        resources:
          # TODO: Set memory limits when we've profiled the container for large
          # clusters, then set request = limit to keep this container in
          # guaranteed class. Currently, this container falls into the
          # "burstable" category so the kubelet doesn't backoff from restarting it.
          limits:
            memory: 170Mi
          requests:
            cpu: 100m
            memory: 70Mi
        livenessProbe:
          httpGet:
            path: /healthcheck/kubedns
            port: 10054
            scheme: HTTP
          initialDelaySeconds: 60
          timeoutSeconds: 5
          successThreshold: 1
          failureThreshold: 5
        readinessProbe:
          httpGet:
            path: /readiness
            port: 8081
            scheme: HTTP
          # we poll on pod startup for the Kubernetes master service and
          # only setup the /readiness HTTP server once that's available.
          initialDelaySeconds: 3
          timeoutSeconds: 5
        args:
        - --domain=cluster.local.
        - --dns-port=10053
        - --config-dir=/kube-dns-config
        - --v=2
        #__PILLAR__FEDERATIONS__DOMAIN__MAP__
        env:
        - name: PROMETHEUS_PORT
          value: "10055"
        ports:
        - containerPort: 10053
          name: dns-local
          protocol: UDP
        - containerPort: 10053
          name: dns-tcp-local
          protocol: TCP
        - containerPort: 10055
          name: metrics
          protocol: TCP
        volumeMounts:
        - name: kube-dns-config
          mountPath: /kube-dns-config
      - name: dnsmasq
        image: xuejipeng/k8s-dns-dnsmasq-nanny-amd64:v1.14.1
        livenessProbe:
          httpGet:
            path: /healthcheck/dnsmasq
            port: 10054
            scheme: HTTP
          initialDelaySeconds: 60
          timeoutSeconds: 5
          successThreshold: 1
          failureThreshold: 5
        args:
        - -v=2
        - -logtostderr
        - -configDir=/etc/k8s/dns/dnsmasq-nanny
        - -restartDnsmasq=true
        - --
        - -k
        - --cache-size=1000
        - --log-facility=-
        - --server=/cluster.local./127.0.0.1#10053
        - --server=/in-addr.arpa/127.0.0.1#10053
        - --server=/ip6.arpa/127.0.0.1#10053
        ports:
        - containerPort: 53
          name: dns
          protocol: UDP
        - containerPort: 53
          name: dns-tcp
          protocol: TCP
        # see: https://github.com/kubernetes/kubernetes/issues/29055 for details
        resources:
          requests:
            cpu: 150m
            memory: 20Mi
        volumeMounts:
        - name: kube-dns-config
          mountPath: /etc/k8s/dns/dnsmasq-nanny
      - name: sidecar
        image: xuejipeng/k8s-dns-sidecar-amd64:v1.14.1
        livenessProbe:
          httpGet:
            path: /metrics
            port: 10054
            scheme: HTTP
          initialDelaySeconds: 60
          timeoutSeconds: 5
          successThreshold: 1
          failureThreshold: 5
        args:
        - --v=2
        - --logtostderr
        - --probe=kubedns,127.0.0.1:10053,kubernetes.default.svc.cluster.local.,5,A
        - --probe=dnsmasq,127.0.0.1:53,kubernetes.default.svc.cluster.local.,5,A
        ports:
        - containerPort: 10054
          name: metrics
          protocol: TCP
        resources:
          requests:
            memory: 20Mi
            cpu: 10m
      dnsPolicy: Default  # Don't use cluster DNS.
      serviceAccountName: kube-dns

kubedns-sa.yaml

apiVersion: v1
kind: ServiceAccount
metadata:
  name: kube-dns
  namespace: kube-system
  labels:
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile

kubedns-svc.yaml

apiVersion: v1
kind: Service
metadata:
  name: kube-dns
  namespace: kube-system
  labels:
    k8s-app: kube-dns
    kubernetes.io/cluster-service: "true"
    addonmanager.kubernetes.io/mode: Reconcile
    kubernetes.io/name: "KubeDNS"
spec:
  selector:
    k8s-app: kube-dns
  clusterIP: 10.254.0.2
  ports:
  - name: dns
    port: 53
    protocol: UDP
  - name: dns-tcp
    port: 53
    protocol: TCP






09 - Deploying the Dashboard Add-on

Deploying the dashboard add-on

Official files directory: kubernetes/cluster/addons/dashboard

Files used:

$ ls *.yaml
dashboard-controller.yaml dashboard-rbac.yaml dashboard-service.yaml


  • a new dashboard-rbac.yaml file was added, defining the RoleBinding used by the dashboard.


Because kube-apiserver has RBAC authorization enabled and the dashboard-controller.yaml in the official source tree does not define an authorized ServiceAccount, later accesses to the kube-apiserver APIs are rejected and the front end shows an authorization error.

The fix is to define a ServiceAccount named dashboard and bind it to a Cluster Role; see the dashboard-rbac.yaml file.

The modified yaml files can be found in: dashboard



Configuring dashboard-service

$ diff dashboard-service.yaml.orig dashboard-service.yaml
10a11
>   type: NodePort


  • the port type is set to NodePort so that the dashboard can be reached from outside at nodeIP:nodePort;

Configuring dashboard-controller

20a21
>       serviceAccountName: dashboard
23c24
<         image: gcr.io/google_containers/kubernetes-dashboard-amd64:v1.6.0
---
>         image: cokabug/kubernetes-dashboard-amd64:v1.6.0


  • the custom ServiceAccount named dashboard is used;

Applying all definition files

$ pwd
/root/kubernetes/cluster/addons/dashboard
$ ls *.yaml
dashboard-controller.yaml dashboard-rbac.yaml dashboard-service.yaml
$ kubectl create -f .
$

Checking the results

View the allocated NodePort

$ kubectl get services kubernetes-dashboard -n kube-system
NAME                   CLUSTER-IP       EXTERNAL-IP   PORT(S)        AGE
kubernetes-dashboard   10.254.224.130   <nodes>       80:30312/TCP   25s


  • NodePort 30312映射到 dashboard pod 80端口;
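
也可以先用 curl 简单确认该 NodePort 已经可达(示例命令,IP 和端口请替换为实际值):

# 任一 Node 的 IP 加上分配的 NodePort 即可访问
$ curl -s http://192.168.1.207:30312/ | head -n 5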

检查 controller


$ kubectl get deployment kubernetes-dashboard -n kube-system

NAME                   DESIRED   CURRENT  UP-TO-DATE   AVAILABLE   AGE

kubernetes-dashboard   1         1         1            1           3m

$ kubectl get pods -n kube-system | grep dashboard

kubernetes-dashboard-1339745653-pmn6z  1/1       Running   0         4m
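
还可以顺便确认 dashboard 使用的 RBAC 对象已经创建(示例命令):

$ kubectl get serviceaccount dashboard -n kube-system
$ kubectl get clusterrolebinding dashboard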

访问dashboard


  1. kubernetes-dashboard 服务暴露了 NodePort,可以使用



    http://NodeIP:nodePort



    地址访问 dashboard;

  2. 通过 kube-apiserver 访问 dashboard;

  3. 通过 kubectl proxy 访问 dashboard:

通过 kubectl proxy 访问 dashboard

启动代理


$ kubectl proxy --address=192.168.1.206 --port=8086 --accept-hosts='^*$'

Starting to serve on 192.168.1.206:8086


  • 需要指定 --accept-hosts 选项,否则浏览器访问 dashboard 页面时提示 "Unauthorized";


浏览器访问 URL:http://192.168.1.206:8086/ui

自动跳转到:http://192.168.1.206:8086/api/v1/proxy/namespaces/kube-system/services/kubernetes-dashboard/#/workload?namespace=default

通过 kube-apiserver 访问dashboard

获取集群服务地址列表


$ kubectl cluster-info

Kubernetes master is running at https://192.168.1.206:6443

KubeDNS is running at https://192.168.1.206:6443/api/v1/proxy/namespaces/kube-system/services/kube-dns

kubernetes-dashboard is running at https://192.168.1.206:6443/api/v1/proxy/namespaces/kube-system/services/kubernetes-dashboard

由于 kube-apiserver 开启了 RBAC 授权,而浏览器访问kube-apiserver 的时候使用的是匿名证书,所以访问安全端口会导致授权失败。这里需要使用非安全端口访问 kube-apiserver:


浏览器访问 URL:http://192.168.1.206:8080/api/v1/proxy/namespaces/kube-system/services/kubernetes-dashboard


由于缺少 Heapster 插件,当前dashboard 不能展示 Pod、Nodes 的 CPU、内存等 metric 图形;

附件:

dashboard-controller.yaml

apiVersion: extensions/v1beta1

kind: Deployment

metadata:

name: kubernetes-dashboard

namespace: kube-system

labels:

k8s-app: kubernetes-dashboard

kubernetes.io/cluster-service: “true”

addonmanager.kubernetes.io/mode: Reconcile

spec:

selector:

matchLabels:

k8s-app: kubernetes-dashboard

template:

metadata:

labels:

k8s-app: kubernetes-dashboard

annotations:

scheduler.alpha.kubernetes.io/critical-pod: ”

spec:

serviceAccountName: dashboard

containers:

– name: kubernetes-dashboard

image: cokabug/kubernetes-dashboard-amd64:v1.6.0

resources:

# keep request = limit to keep this container in guaranteed class

limits:

cpu: 100m

memory: 50Mi

requests:

cpu: 100m

memory: 50Mi

ports:

– containerPort: 9090

livenessProbe:

httpGet:

path: /

port: 9090

initialDelaySeconds: 30

timeoutSeconds: 30

tolerations:

– key: “CriticalAddonsOnly”

operator: “Exists”

dashboard-rbac.yaml

apiVersion: v1

kind: ServiceAccount

metadata:

name: dashboard

namespace: kube-system

---

kind: ClusterRoleBinding

apiVersion: rbac.authorization.k8s.io/v1alpha1

metadata:

name: dashboard

subjects:

– kind: ServiceAccount

name: dashboard

namespace: kube-system

roleRef:

kind: ClusterRole

name: cluster-admin

apiGroup: rbac.authorization.k8s.io

dashboard-service.yaml

apiVersion: v1

kind: Service

metadata:

name: kubernetes-dashboard

namespace: kube-system

labels:

k8s-app: kubernetes-dashboard

kubernetes.io/cluster-service: “true”

addonmanager.kubernetes.io/mode: Reconcile

spec:

type: NodePort

selector:

k8s-app: kubernetes-dashboard

ports:

– port: 80

targetPort: 9090



10-部署Heapster插件


部署 heapster 插件





heapster release 页面


下载最新版本的 heapster


$ wget https://github.com/kubernetes/heapster/archive/v1.3.0.zip

$ unzip v1.3.0.zip

$ mv v1.3.0.zip heapster-1.3.0

$


官方文件目录:


heapster-1.3.0/deploy/kube-config/influxdb


$ cd heapster-1.3.0/deploy/kube-config/influxdb

$ ls *.yaml

grafana-deployment.yaml heapster-deployment.yaml heapster-service.yaml influxdb-deployment.yaml

grafana-service.yaml    heapster-rbac.yaml       influxdb-cm.yaml      influxdb-service.yaml


  • 新加了


    heapster-rbac.yaml





    influxdb-cm.yaml


    文件,分别定义 RoleBinding 和 inflxudb 的配置;


已经修改好的 yaml 文件见:


heapster



配置 grafana-deployment


$ diff grafana-deployment.yaml.orig grafana-deployment.yaml

16c16

< image: gcr.io/google_containers/heapster-grafana-amd64:v4.0.2

---

> image: lvanneo/heapster-grafana-amd64:v4.0.2

40,41c40,41

< # value: /api/v1/proxy/namespaces/kube-system/services/monitoring-grafana/

< value: /

---

> value: /api/v1/proxy/namespaces/kube-system/services/monitoring-grafana/

> # value: /


  • 如果后续使用 kube-apiserver 或者 kubectl proxy 访问 grafana dashboard,则必须将 GF_SERVER_ROOT_URL 设置为 /api/v1/proxy/namespaces/kube-system/services/monitoring-grafana/,否则后续访问 grafana 时会提示找不到 http://10.64.3.7:8086/api/v1/proxy/namespaces/kube-system/services/monitoring-grafana/api/dashboards/home 页面;

配置 heapster-deployment


$ diff heapster-deployment.yaml.orig heapster-deployment.yaml

13a14

> serviceAccountName: heapster

16c17

< image: gcr.io/google_containers/heapster-amd64:v1.3.0-beta.1

---

> image: lvanneo/heapster-amd64:v1.3.0-beta.1


  • 使用的是自定义的、名为 heapster 的 ServiceAccount;

配置 influxdb-deployment

influxdb 官方建议使用命令行或 HTTP API 接口来查询数据库,从 v1.1.0版本开始默认关闭 admin UI,将在后续版本中移除 admin UI 插件。

开启镜像中 admin UI的办法如下:先导出镜像中的 influxdb 配置文件,开启admin 插件后,再将配置文件内容写入 ConfigMap,最后挂载到镜像中,达到覆盖原始配置的目的。相关步骤如下:


注意:无需自己导出、修改和创建ConfigMap,可以直接使用放在 manifests 目录下的


ConfigMap文件




$ # 导出镜像中的 influxdb 配置文件

$ docker run --rm --entrypoint 'cat' -ti lvanneo/heapster-influxdb-amd64:v1.1.1 /etc/config.toml > config.toml.orig

$ cp config.toml.orig config.toml

$ # 修改:启用 admin 接口

$ vim config.toml

$ diff config.toml.orig config.toml

35c35

< enabled = false

---

> enabled = true

$ # 将修改后的配置写入到 ConfigMap 对象中

$ kubectl create configmap influxdb-config --from-file=config.toml -n kube-system

configmap "influxdb-config" created

$ # 将 ConfigMap 中的配置文件挂载到 Pod 中,达到覆盖原始配置的目的

$ diff influxdb-deployment.yaml.orig influxdb-deployment.yaml

16c16

< image: gcr.io/google_containers/heapster-influxdb-amd64:v1.1.1

---

> image: lvanneo/heapster-influxdb-amd64:v1.1.1

19a20,21

> - mountPath: /etc/

>   name: influxdb-config

22a25,27

> - name: influxdb-config

>   configMap:

>     name: influxdb-config

配置 monitoring-influxdb Service


$ diff influxdb-service.yaml.orig influxdb-service.yaml

12a13

> type: NodePort

15a17,20

>   name: http

> - port: 8083

>   targetPort: 8083

>   name: admin


  • 定义端口类型为 NodePort,额外增加了 admin 端口映射,用于后续浏览器访问 influxdb 的 admin UI 界面;

执行所有定义文件


$ pwd

/root/heapster-1.3.0/deploy/kube-config/influxdb

$ ls *.yaml

grafana-deployment.yaml heapster-deployment.yaml heapster-service.yaml influxdb-deployment.yaml

grafana-service.yaml    heapster-rbac.yaml       influxdb-cm.yaml      influxdb-service.yaml

$ kubectl create -f .

$

检查执行结果

检查 Deployment


$ kubectl get deployments -n kube-system | grep -E 'heapster|monitoring'

heapster               1         1         1            1           1m

monitoring-grafana     1         1         1            1           1m

monitoring-influxdb    1         1         1            1           1m

检查 Pods


$ kubectl get pods -n kube-system | grep -E 'heapster|monitoring'

heapster-3273315324-tmxbg              1/1       Running   0         11m

monitoring-grafana-2255110352-94lpn    1/1       Running   0         11m

monitoring-influxdb-884893134-3vb6n    1/1       Running   0         11m

检查 kubernetes dashboard 界面,看是否显示各 Nodes、Pods 的 CPU、内存、负载等利用率曲线图;
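
heapster 正常采集数据后,一般也可以直接用 kubectl top 在命令行查看资源使用情况(示例命令,依赖 heapster 提供的 metrics):

$ kubectl top node
$ kubectl top pod -n kube-system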


访问 grafana


  1. 通过 kube-apiserver 访问:

    获取 monitoring-grafana 服务 URL



    $ kubectl cluster-info

    Kubernetes master is running at



    https://10.64.3.7:6443




    Heapster is running at



    https://10.64.3.7:6443/api/v1/proxy/namespaces/kube-system/services/heapster




    KubeDNS is running at



    https://10.64.3.7:6443/api/v1/proxy/namespaces/kube-system/services/kube-dns




    kubernetes-dashboard is running at



    https://10.64.3.7:6443/api/v1/proxy/namespaces/kube-system/services/kubernetes-dashboard




    monitoring-grafana is running at



    https://10.64.3.7:6443/api/v1/proxy/namespaces/kube-system/services/monitoring-grafana




    monitoring-influxdb is running at



    https://10.64.3.7:6443/api/v1/proxy/namespaces/kube-system/services/monitoring-influxdb




    $



    由于 kube-apiserver 开启了 RBAC 授权,而浏览器访问 kube-apiserver 的时候使用的是匿名证书,所以访问安全端口会导致授权失败。这里需要使用非安全端口访问 kube-apiserver:

    浏览器访问 URL:



    http://10.64.3.7:8080/api/v1/proxy/namespaces/kube-system/services/monitoring-grafana


  2. 通过 kubectl proxy 访问:

    创建代理



    $ kubectl proxy --address='10.64.3.7' --port=8086 --accept-hosts='^*$'

    Starting to serve on 10.64.3.7:8086



    浏览器访问 URL:


    http://10.64.3.7:8086/api/v1/proxy/namespaces/kube-system/services/monitoring-grafana


访问 influxdb admin UI

获取 influxdb http 8086 映射的 NodePort


$ kubectl get svc -n kube-system | grep influxdb

monitoring-influxdb   10.254.255.183   <nodes>       8086:8670/TCP,8083:8595/TCP   21m


通过 kube-apiserver 的非安全端口访问influxdb 的 admin UI 界面:



http://10.64.3.7:8080/api/v1/proxy/namespaces/kube-system/services/monitoring-influxdb:8083/

在页面的 “Connection Settings” 的 Host 中输入 node IP,Port 中输入 8086 映射的 nodePort 如上面的 8670,点击 “Save” 即可:
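
也可以直接通过 NodePort 用 influxdb 的 HTTP 接口做一次简单查询,确认服务可用(示例命令,8670 为上面 8086 端口映射出的 NodePort,按实际值替换):

$ curl -sG 'http://10.64.3.7:8670/query' --data-urlencode 'q=SHOW DATABASES'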


附件:

grafana-deployment.yaml

apiVersion: extensions/v1beta1

kind: Deployment

metadata:

name: monitoring-grafana

namespace: kube-system

spec:

replicas: 1

template:

metadata:

labels:

task: monitoring

k8s-app: grafana

spec:

containers:

– name: grafana

image: lvanneo/heapster-grafana-amd64:v4.0.2

ports:

– containerPort: 3000

protocol: TCP

volumeMounts:

– mountPath: /var

name: grafana-storage

env:

– name: INFLUXDB_HOST

value: monitoring-influxdb

– name: GRAFANA_PORT

value: “3000”

# The following env variables are required to make Grafana accessible via

# the kubernetes api-server proxy. On production clusters, we recommend

# removing these env variables, setup auth for grafana, and expose the grafana

# service using a LoadBalancer or a public IP.

– name: GF_AUTH_BASIC_ENABLED

value: “false”

– name: GF_AUTH_ANONYMOUS_ENABLED

value: “true”

– name: GF_AUTH_ANONYMOUS_ORG_ROLE

value: Admin

– name: GF_SERVER_ROOT_URL

# If you’re only using the API Server proxy, set this value instead:

value: /api/v1/proxy/namespaces/kube-system/services/monitoring-grafana/

#value: /

volumes:

– name: grafana-storage

emptyDir: {}

grafana-service.yaml

apiVersion: v1

kind: Service

metadata:

labels:

# For use as a Cluster add-on (

https://github.com/kubernetes/kubernetes/tree/master/cluster/addons

)

# If you are NOT using this as an addon, you should comment out this line.

kubernetes.io/cluster-service: ‘true’

kubernetes.io/name: monitoring-grafana

name: monitoring-grafana

namespace: kube-system

spec:

# In a production setup, we recommend accessing Grafana through an external Loadbalancer

# or through a public IP.

# type: LoadBalancer

# You could also use NodePort to expose the service at a randomly-generated port

ports:

– port : 80

targetPort: 3000

selector:

k8s-app: grafana

heapster-deployment.yaml

apiVersion: extensions/v1beta1

kind: Deployment

metadata:

name: heapster

namespace: kube-system

spec:

replicas: 1

template:

metadata:

labels:

task: monitoring

k8s-app: heapster

spec:

serviceAccountName: heapster

containers:

– name: heapster

image: lvanneo/heapster-amd64:v1.3.0-beta.1

imagePullPolicy: IfNotPresent

command:

– /heapster

– –source=kubernetes:https://kubernetes.default

– –sink=influxdb:http://monitoring-influxdb:8086

heapster-rbac.yaml

apiVersion: v1

kind: ServiceAccount

metadata:

name: heapster

namespace: kube-system

---

kind: ClusterRoleBinding

apiVersion: rbac.authorization.k8s.io/v1alpha1

metadata:

name: heapster

subjects:

– kind: ServiceAccount

name: heapster

namespace: kube-system

roleRef:

kind: ClusterRole

name: system:heapster

apiGroup: rbac.authorization.k8s.io

heapster-service.yaml

apiVersion: v1

kind: Service

metadata:

labels:

task: monitoring

# For use as a Cluster add-on (

https://github.com/kubernetes/kubernetes/tree/master/cluster/addons

)

# If you are NOT using this as an addon, you should comment out this line.

kubernetes.io/cluster-service: ‘true’

kubernetes.io/name: Heapster

name: heapster

namespace: kube-system

spec:

ports:

– port: 80

targetPort: 8082

selector:

k8s-app: heapster

influxdb-cm.yaml

apiVersion: v1

kind: ConfigMap

metadata:

name: influxdb-config

namespace: kube-system

data:

config.toml: |

reporting-disabled = true

bind-address = “:8088”

[meta]

dir = “/data/meta”

retention-autocreate = true

logging-enabled = true

[data]

dir = “/data/data”

wal-dir = “/data/wal”

query-log-enabled = true

cache-max-memory-size = 1073741824

cache-snapshot-memory-size = 26214400

cache-snapshot-write-cold-duration = “10m0s”

compact-full-write-cold-duration = “4h0m0s”

max-series-per-database = 1000000

max-values-per-tag = 100000

trace-logging-enabled = false

[coordinator]

write-timeout = “10s”

max-concurrent-queries = 0

query-timeout = “0s”

log-queries-after = “0s”

max-select-point = 0

max-select-series = 0

max-select-buckets = 0

[retention]

enabled = true

check-interval = “30m0s”

[admin]

enabled = true

bind-address = “:8083”

https-enabled = false

https-certificate = “/etc/ssl/influxdb.pem”

[shard-precreation]

enabled = true

check-interval = “10m0s”

advance-period = “30m0s”

[monitor]

store-enabled = true

store-database = “_internal”

store-interval = “10s”

[subscriber]

enabled = true

http-timeout = “30s”

insecure-skip-verify = false

ca-certs = “”

write-concurrency = 40

write-buffer-size = 1000

[http]

enabled = true

bind-address = “:8086”

auth-enabled = false

log-enabled = true

write-tracing = false

pprof-enabled = false

https-enabled = false

https-certificate = “/etc/ssl/influxdb.pem”

https-private-key = “”

max-row-limit = 10000

max-connection-limit = 0

shared-secret = “”

realm = “InfluxDB”

unix-socket-enabled = false

bind-socket = “/var/run/influxdb.sock”

[[graphite]]

enabled = false

bind-address = “:2003”

database = “graphite”

retention-policy = “”

protocol = “tcp”

batch-size = 5000

batch-pending = 10

batch-timeout = “1s”

consistency-level = “one”

separator = “.”

udp-read-buffer = 0

[[collectd]]

enabled = false

bind-address = “:25826”

database = “collectd”

retention-policy = “”

batch-size = 5000

batch-pending = 10

batch-timeout = “10s”

read-buffer = 0

typesdb = “/usr/share/collectd/types.db”

[[opentsdb]]

enabled = false

bind-address = “:4242”

database = “opentsdb”

retention-policy = “”

consistency-level = “one”

tls-enabled = false

certificate = “/etc/ssl/influxdb.pem”

batch-size = 1000

batch-pending = 5

batch-timeout = “1s”

log-point-errors = true

[[udp]]

enabled = false

bind-address = “:8089”

database = “udp”

retention-policy = “”

batch-size = 5000

batch-pending = 10

read-buffer = 0

batch-timeout = “1s”

precision = “”

[continuous_queries]

log-enabled = true

enabled = true

run-interval = “1s”

influxdb-deployment.yaml

apiVersion: extensions/v1beta1

kind: Deployment

metadata:

name: monitoring-influxdb

namespace: kube-system

spec:

replicas: 1

template:

metadata:

labels:

task: monitoring

k8s-app: influxdb

spec:

containers:

– name: influxdb

image: lvanneo/heapster-influxdb-amd64:v1.1.1

volumeMounts:

– mountPath: /data

name: influxdb-storage

– mountPath: /etc/

name: influxdb-config

volumes:

– name: influxdb-storage

emptyDir: {}

– name: influxdb-config

configMap:

name: influxdb-config

influxdb-service.yaml

apiVersion: v1

kind: Service

metadata:

labels:

task: monitoring

# For use as a Cluster add-on (

https://github.com/kubernetes/kubernetes/tree/master/cluster/addons

)

# If you are NOT using this as an addon, you should comment out this line.

kubernetes.io/cluster-service: ‘true’

kubernetes.io/name: monitoring-influxdb

name: monitoring-influxdb

namespace: kube-system

spec:

type: NodePort

ports:

– port: 8086

targetPort: 8086

name: http

– port: 8083

targetPort: 8083

name: admin

selector:

k8s-app: influxdb



11-部署EFK插件

部署 EFK 插件


官方文件目录:


kubernetes/cluster/addons/fluentd-elasticsearch


$ ls *.yaml

es-controller.yaml es-rbac.yaml es-service.yaml fluentd-es-ds.yaml kibana-controller.yaml kibana-service.yaml fluentd-es-rbac.yaml


  • 新加了


    es-rbac.yaml





    fluentd-es-rbac.yaml


    文件,定义了 elasticsearch 和 fluentd 使用的 Role 和 RoleBinding;


已经修改好的 yaml 文件见:


EFK



配置 es-controller.yaml


$ diff es-controller.yaml.orig es-controller.yaml

22a23

> serviceAccountName: elasticsearch

24c25

< - image: gcr.io/google_containers/elasticsearch:v2.4.1-2

---

> - image: onlyerich/elasticsearch:v2.4.1-2

配置 es-service.yaml

无需配置;

配置 fluentd-es-ds.yaml


$ diff fluentd-es-ds.yaml.orig fluentd-es-ds.yaml

23a24

> serviceAccountName: fluentd

26c27

< image: gcr.io/google_containers/fluentd-elasticsearch:1.22

---

> image: onlyerich/fluentd-elasticsearch:1.22

配置 kibana-controller.yaml


$ diff kibana-controller.yaml.orig kibana-controller.yaml

22c22

< image: gcr.io/google_containers/kibana:v4.6.1-1

---

> image: onlyerich/kibana:v4.6.1-1

给 Node 设置标签


DaemonSet


fluentd-es-v1.22


只会调度到设置了标签


beta.kubernetes.io/fluentd-ds-ready=true


的 Node,需要在期望运行 fluentd的 Node 上设置该标签;


$ kubectl get nodes

NAME        STATUS    AGE      VERSION

10.64.3.7   Ready     1d       v1.6.2

$ kubectl label nodes 10.64.3.7 beta.kubernetes.io/fluentd-ds-ready=true

node "10.64.3.7" labeled
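
如果希望所有 Node 都运行 fluentd,也可以一次性给全部 Node 打上该标签(示例命令):

$ kubectl label nodes --all beta.kubernetes.io/fluentd-ds-ready=true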

执行定义文件


$ pwd

/root/kubernetes/cluster/addons/fluentd-elasticsearch

$ ls *.yaml

es-controller.yaml es-rbac.yaml es-service.yaml fluentd-es-ds.yaml kibana-controller.yaml kibana-service.yaml fluentd-es-rbac.yaml

$ kubectl create -f .

$

检查执行结果


$ kubectl get deployment -n kube-system | grep kibana

kibana-logging         1         1         1            1           2m

$ kubectl get pods -n kube-system | grep -E 'elasticsearch|fluentd|kibana'

elasticsearch-logging-v1-kwc9w         1/1       Running   0         4m

elasticsearch-logging-v1-ws9mk         1/1       Running   0         4m

fluentd-es-v1.22-g76x0                 1/1       Running   0         4m

kibana-logging-324921636-ph7sn         1/1       Running   0         4m

$ kubectl get service -n kube-system | grep -E 'elasticsearch|kibana'

elasticsearch-logging  10.254.128.156   <none>        9200/TCP       3m

kibana-logging         10.254.88.109    <none>        5601/TCP       3m
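
也可以通过 kube-apiserver 的非安全端口检查 Elasticsearch 集群的健康状态(示例命令,访问方式与下文访问 kibana 相同):

$ curl -s http://10.64.3.7:8080/api/v1/proxy/namespaces/kube-system/services/elasticsearch-logging/_cluster/health?pretty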

kibana Pod 第一次启动时会用**较长时间(10-20分钟)**来优化和 Cache状态页面,可以 tailf 该 Pod 的日志观察进度:


$ kubectl logs kibana-logging-324921636-ph7sn -n kube-system -f

ELASTICSEARCH_URL=http://elasticsearch-logging:9200

server.basePath: /api/v1/proxy/namespaces/kube-system/services/kibana-logging

{"type":"log","@timestamp":"2017-04-08T09:30:30Z","tags":["info","optimize"],"pid":7,"message":"Optimizing and caching bundles for kibana and statusPage. This may take a few minutes"}

{"type":"log","@timestamp":"2017-04-08T09:44:01Z","tags":["info","optimize"],"pid":7,"message":"Optimization of bundles for kibana and statusPage complete in 811.00 seconds"}

{"type":"log","@timestamp":"2017-04-08T09:44:02Z","tags":["status","plugin:kibana@1.0.0","info"],"pid":7,"state":"green","message":"Status changed from uninitialized to green - Ready","prevState":"uninitialized","prevMsg":"uninitialized"}

访问 kibana


  1. 通过 kube-apiserver 访问:

    获取 monitoring-grafana 服务 URL



    $ kubectl cluster-info

    Kubernetes master is running at



    https://10.64.3.7:6443




    Elasticsearch is running at



    https://10.64.3.7:6443/api/v1/proxy/namespaces/kube-system/services/elasticsearch-logging




    Heapster is running at



    https://10.64.3.7:6443/api/v1/proxy/namespaces/kube-system/services/heapster




    Kibana is running at



    https://10.64.3.7:6443/api/v1/proxy/namespaces/kube-system/services/kibana-logging




    KubeDNS is running at



    https://10.64.3.7:6443/api/v1/proxy/namespaces/kube-system/services/kube-dns




    kubernetes-dashboard is running at



    https://10.64.3.7:6443/api/v1/proxy/namespaces/kube-system/services/kubernetes-dashboard




    monitoring-grafana is running at



    https://10.64.3.7:6443/api/v1/proxy/namespaces/kube-system/services/monitoring-grafana




    monitoring-influxdb is running at



    https://10.64.3.7:6443/api/v1/proxy/namespaces/kube-system/services/monitoring-influxdb






    由于 kube-apiserver 开启了 RBAC 授权,而浏览器访问 kube-apiserver 的时候使用的是匿名证书,所以访问安全端口会导致授权失败。这里需要使用非安全端口访问 kube-apiserver:

    浏览器访问 URL:



    http://10.64.3.7:8080/api/v1/proxy/namespaces/kube-system/services/kibana-logging


  2. 通过 kubectl proxy 访问:

    创建代理



    $ kubectl proxy --address='10.64.3.7' --port=8086 --accept-hosts='^*$'

    Starting to serve on 10.64.3.7:8086



    浏览器访问 URL:


    http://10.64.3.7:8086/api/v1/proxy/namespaces/kube-system/services/kibana-logging


在 Settings -> Indices 页面创建一个 index(相当于 mysql 中的一个 database),选中 Index contains time-based events,使用默认的 logstash-* pattern,点击 Create;

创建 Index 后,稍等几分钟就可以在 Discover 菜单下看到 ElasticSearch logging 中汇聚的日志;


附件:

es-controller.yaml

apiVersion: v1

kind: ReplicationController

metadata:

name: elasticsearch-logging-v1

namespace: kube-system

labels:

k8s-app: elasticsearch-logging

version: v1

kubernetes.io/cluster-service: “true”

addonmanager.kubernetes.io/mode: Reconcile

spec:

replicas: 2

selector:

k8s-app: elasticsearch-logging

version: v1

template:

metadata:

labels:

k8s-app: elasticsearch-logging

version: v1

kubernetes.io/cluster-service: “true”

spec:

serviceAccountName: elasticsearch

containers:

– image: onlyerich/elasticsearch:v2.4.1-2

name: elasticsearch-logging

resources:

# need more cpu upon initialization, therefore burstable class

limits:

cpu: 1000m

requests:

cpu: 100m

ports:

– containerPort: 9200

name: db

protocol: TCP

– containerPort: 9300

name: transport

protocol: TCP

volumeMounts:

– name: es-persistent-storage

mountPath: /data

env:

– name: “NAMESPACE”

valueFrom:

fieldRef:

fieldPath: metadata.namespace

volumes:

– name: es-persistent-storage

emptyDir: {}

es-rbac.yaml

apiVersion: v1

kind: ServiceAccount

metadata:

name: elasticsearch

namespace: kube-system

---

kind: ClusterRoleBinding

apiVersion: rbac.authorization.k8s.io/v1alpha1

metadata:

name: elasticsearch

subjects:

– kind: ServiceAccount

name: elasticsearch

namespace: kube-system

roleRef:

kind: ClusterRole

name: view

apiGroup: rbac.authorization.k8s.io

es-service.yaml

apiVersion: v1

kind: Service

metadata:

name: elasticsearch-logging

namespace: kube-system

labels:

k8s-app: elasticsearch-logging

kubernetes.io/cluster-service: “true”

addonmanager.kubernetes.io/mode: Reconcile

kubernetes.io/name: “Elasticsearch”

spec:

ports:

– port: 9200

protocol: TCP

targetPort: db

selector:

k8s-app: elasticsearch-logging

fluentd-es-ds.yaml

apiVersion: extensions/v1beta1

kind: DaemonSet

metadata:

name: fluentd-es-v1.22

namespace: kube-system

labels:

k8s-app: fluentd-es

kubernetes.io/cluster-service: “true”

addonmanager.kubernetes.io/mode: Reconcile

version: v1.22

spec:

template:

metadata:

labels:

k8s-app: fluentd-es

kubernetes.io/cluster-service: “true”

version: v1.22

# This annotation ensures that fluentd does not get evicted if the node

# supports critical pod annotation based priority scheme.

# Note that this does not guarantee admission on the nodes (#40573).

annotations:

scheduler.alpha.kubernetes.io/critical-pod: ”

spec:

serviceAccountName: fluentd

containers:

– name: fluentd-es

image: onlyerich/fluentd-elasticsearch:1.22

command:

– ‘/bin/sh’

– ‘-c’

– ‘/usr/sbin/td-agent 2>&1 >> /var/log/fluentd.log’

resources:

limits:

memory: 200Mi

requests:

cpu: 100m

memory: 200Mi

volumeMounts:

– name: varlog

mountPath: /var/log

– name: varlibdockercontainers

mountPath: /var/lib/docker/containers

readOnly: true

nodeSelector:

beta.kubernetes.io/fluentd-ds-ready: “true”

tolerations:

– key : “node.alpha.kubernetes.io/ismaster”

effect: “NoSchedule”

terminationGracePeriodSeconds: 30

volumes:

– name: varlog

hostPath:

path: /var/log

– name: varlibdockercontainers

hostPath:

path: /var/lib/docker/containers

fluentd-es-rbac.yaml

apiVersion: v1

kind: ServiceAccount

metadata:

name: fluentd

namespace: kube-system

---

kind: ClusterRoleBinding

apiVersion: rbac.authorization.k8s.io/v1alpha1

metadata:

name: fluentd

subjects:

– kind: ServiceAccount

name: fluentd

namespace: kube-system

roleRef:

kind: ClusterRole

name: view

apiGroup: rbac.authorization.k8s.io

kibana-controller.yaml

apiVersion: extensions/v1beta1

kind: Deployment

metadata:

name: kibana-logging

namespace: kube-system

labels:

k8s-app: kibana-logging

kubernetes.io/cluster-service: “true”

addonmanager.kubernetes.io/mode: Reconcile

spec:

replicas: 1

selector:

matchLabels:

k8s-app: kibana-logging

template:

metadata:

labels:

k8s-app: kibana-logging

spec:

containers:

– name: kibana-logging

image: onlyerich/kibana:v4.6.1-1

resources:

# keep request = limit to keep this container in guaranteed class

limits:

cpu: 100m

requests:

cpu: 100m

env:

- name: "ELASTICSEARCH_URL"

value: "http://elasticsearch-logging:9200"

– name: “KIBANA_BASE_URL”

value: “/api/v1/proxy/namespaces/kube-system/services/kibana-logging”

ports:

– containerPort: 5601

name: ui

protocol: TCP

kibana-service.yaml

apiVersion: v1

kind: Service

metadata:

name: kibana-logging

namespace: kube-system

labels:

k8s-app: kibana-logging

kubernetes.io/cluster-service: “true”

addonmanager.kubernetes.io/mode: Reconcile

kubernetes.io/name: “Kibana”

spec:

ports:

– port: 5601

protocol: TCP

targetPort: ui

selector:

k8s-app: kibana-logging




12-部署Docker-Registry




部署私有 docker registry


注意:本文档介绍使用 docker 官方的 registry v2镜像部署私有仓库的步骤,你也可以部署 Harbor 私有仓库(


部署Harbor 私有仓库


)。

本文档讲解部署一个 TLS 加密、HTTP Basic 认证、用 ceph rgw做后端存储的私有 docker registry 步骤,如果使用其它类型的后端存储,则可以从 “创建 docker registry” 节开始;

示例两台机器 IP 如下:

  • ceph rgw: 10.64.3.9
  • docker registry: 10.64.3.7

部署 ceph RGW 节点


$ ceph-deploy rgw create 10.64.3.9


# rgw 默认监听7480端口



$

创建测试账号 demo


$ radosgw-admin user create --uid=demo --display-name="ceph rgw demo user"

$

创建 demo 账号的子账号 swift

当前 registry 只支持使用 swift 协议访问 ceph rgw 存储,暂时不支持s3 协议;


$ radosgw-admin subuser create --uid demo --subuser=demo:swift --access=full --secret=secretkey --key-type=swift

$

创建 demo:swift 子账号的 sercret key


$ radosgw-admin key create --subuser=demo:swift --key-type=swift --gen-secret

{
    "user_id": "demo",
    "display_name": "ceph rgw demo user",
    "email": "",
    "suspended": 0,
    "max_buckets": 1000,
    "auid": 0,
    "subusers": [
        {
            "id": "demo:swift",
            "permissions": "full-control"
        }
    ],
    "keys": [
        {
            "user": "demo",
            "access_key": "5Y1B1SIJ2YHKEHO5U36B",
            "secret_key": "nrIvtPqUj7pUlccLYPuR3ntVzIa50DToIpe7xFjT"
        }
    ],
    "swift_keys": [
        {
            "user": "demo:swift",
            "secret_key": "aCgVTx3Gfz1dBiFS4NfjIRmvT0sgpHDP6aa0Yfrh"
        }
    ],
    "caps": [],
    "op_mask": "read, write, delete",
    "default_placement": "",
    "placement_tags": [],
    "bucket_quota": {
        "enabled": false,
        "max_size_kb": -1,
        "max_objects": -1
    },
    "user_quota": {
        "enabled": false,
        "max_size_kb": -1,
        "max_objects": -1
    },
    "temp_url_keys": []
}

  • aCgVTx3Gfz1dBiFS4NfjIRmvT0sgpHDP6aa0Yfrh 为子账号 demo:swift 的 secret key;
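
可以先用 swift v1 认证接口简单验证该子账号和 secret key 是否可用(示例命令,authurl 沿用下文 registry 配置中的地址):

$ curl -i http://10.64.3.9:7480/auth/v1 \
  -H "X-Auth-User: demo:swift" \
  -H "X-Auth-Key: aCgVTx3Gfz1dBiFS4NfjIRmvT0sgpHDP6aa0Yfrh"
# 正常情况下响应头中会返回 X-Auth-Token 和 X-Storage-Url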

创建 docker registry

创建 registry 使用的 TLS 证书


$ mkdir -p registry/{auth,certs}

$ cat registry-csr.json

{
  "CN": "registry",
  "hosts": [
    "127.0.0.1",
    "10.64.3.7"
  ],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "BeiJing",
      "L": "BeiJing",
      "O": "k8s",
      "OU": "System"
    }
  ]
}

$ cfssl gencert -ca=/etc/kubernetes/ssl/ca.pem \
  -ca-key=/etc/kubernetes/ssl/ca-key.pem \
  -config=/etc/kubernetes/ssl/ca-config.json \
  -profile=kubernetes registry-csr.json | cfssljson -bare registry

$ cp registry.pem registry-key.pem registry/certs

$

  • 这里复用以前创建的 CA 证书和秘钥文件;
  • hosts 字段指定 registry 的 NodeIP;

创建 HTTP Baisc 认证文件


$ docker run --entrypoint htpasswd registry:2 -Bbn foo foo123 > auth/htpasswd

$ cat auth/htpasswd

foo:$2y$05$I60z69MdluAQ8i1Ka3x3Neb332yz1ioow2C4oroZSOE0fqPogAmZm

配置 registry 参数


$ export RGW_AUTH_URL="http://10.64.3.9:7480/auth/v1"

$ export RGW_USER="demo:swift"

$ export RGW_SECRET_KEY="aCgVTx3Gfz1dBiFS4NfjIRmvT0sgpHDP6aa0Yfrh"

$ cat > config.yml << EOF
# https://docs.docker.com/registry/configuration/#list-of-configuration-options
version: 0.1
log:
  level: info
  formatter: text
  fields:
    service: registry

storage:
  cache:
    blobdescriptor: inmemory
  delete:
    enabled: true
  swift:
    authurl: ${RGW_AUTH_URL}
    username: ${RGW_USER}
    password: ${RGW_SECRET_KEY}
    container: registry

auth:
  htpasswd:
    realm: basic-realm
    path: /auth/htpasswd

http:
  addr: 0.0.0.0:8000
  headers:
    X-Content-Type-Options: [nosniff]
  tls:
    certificate: /certs/registry.pem
    key: /certs/registry-key.pem

health:
  storagedriver:
    enabled: true
    interval: 10s
    threshold: 3
EOF

  • storage.swift 指定后端使用 swift 接口协议的存储,这里配置的是 ceph rgw 存储参数;
  • auth.htpasswd 指定了 HTTP Basic 认证的 token 文件路径;
  • http.tls 指定了 registry http 服务器的证书和秘钥文件路径;

创建 docker registry


$ docker run -d -p 8000:8000 \
  -v $(pwd)/registry/auth/:/auth \
  -v $(pwd)/registry/certs:/certs \
  -v $(pwd)/config.yml:/etc/docker/registry/config.yml \
  --name registry registry:2

  • 执行该 docker run 命令的机器 IP 为 10.64.3.7;
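
registry 容器启动后,可以先用 API 根路径确认 TLS 和 Basic 认证工作正常(示例命令,使用上面创建的 foo 账号和签发 registry 证书的 CA):

$ curl --cacert /etc/kubernetes/ssl/ca.pem -u foo:foo123 https://10.64.3.7:8000/v2/
# 认证通过时返回一个空的 JSON 对象 {}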

向 registry push image

将签署 registry 证书的 CA证书拷贝到 /etc/docker/certs.d/10.64.3.7:8000 目录下


$ sudo mkdir -p /etc/docker/certs.d/10.64.3.7:8000

$ sudo cp /etc/kubernetes/ssl/ca.pem /etc/docker/certs.d/10.64.3.7:8000/ca.crt

$

登陆私有 registry


$ docker login 10.64.3.7:8000

Username: foo

Password:

Login Succeeded

登陆信息被写入 ~/.docker/config.json 文件


$ cat ~/.docker/config.json

{
    "auths": {
        "10.64.3.7:8000": {
            "auth": "Zm9vOmZvbzEyMw=="
        }
    }
}

将本地的 image 打上私有 registry 的 tag


$ docker tag docker.io/kubernetes/pause 10.64.3.7:8000/zhangjun3/pause

$ docker images | grep pause

docker.io/kubernetes/pause                            latest              f9d5de079539        2 years ago         239.8 kB

10.64.3.7:8000/zhangjun3/pause                        latest              f9d5de079539        2 years ago         239.8 kB

将 image push 到私有 registry


$ docker push 10.64.3.7:8000/zhangjun3/pause

The push refers to a repository[10.64.3.7:8000/zhangjun3/pause]

5f70bf18a086: Pushed

e16a89738269: Pushed

latest: digest:sha256:9a6b437e896acad3f5a2a8084625fdd4177b2e7124ee943af642259f2f283359 size:916

查看 ceph 上是否已经有 push 的 pause 容器文件


$ rados lspools

rbd

.rgw.root

default.rgw.control

default.rgw.data.root

default.rgw.gc

default.rgw.log

default.rgw.users.uid

default.rgw.users.keys

default.rgw.users.swift

default.rgw.buckets.index

default.rgw.buckets.data


$ rados --pool default.rgw.buckets.data ls | grep pause

9c2d5a9d-19e6-4003-90b5-b1cbf15e890d.4310.1_files/docker/registry/v2/repositories/zhangjun3/pause/_layers/sha256/f9d5de0795395db6c50cb1ac82ebed1bd8eb3eefcebb1aa724e01239594e937b/link

9c2d5a9d-19e6-4003-90b5-b1cbf15e890d.4310.1_files/docker/registry/v2/repositories/zhangjun3/pause/_layers/sha256/f72a00a23f01987b42cb26f259582bb33502bdb0fcf5011e03c60577c4284845/link

9c2d5a9d-19e6-4003-90b5-b1cbf15e890d.4310.1_files/docker/registry/v2/repositories/zhangjun3/pause/_layers/sha256/a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4/link

9c2d5a9d-19e6-4003-90b5-b1cbf15e890d.4310.1_files/docker/registry/v2/repositories/zhangjun3/pause/_manifests/tags/latest/current/link

9c2d5a9d-19e6-4003-90b5-b1cbf15e890d.4310.1_files/docker/registry/v2/repositories/zhangjun3/pause/_manifests/tags/latest/index/sha256/9a6b437e896acad3f5a2a8084625fdd4177b2e7124ee943af642259f2f283359/link

9c2d5a9d-19e6-4003-90b5-b1cbf15e890d.4310.1_files/docker/registry/v2/repositories/zhangjun3/pause/_manifests/revisions/sha256/9a6b437e896acad3f5a2a8084625fdd4177b2e7124ee943af642259f2f283359/link

私有 registry 的运维操作

查询私有镜像中的 images


$ curl --user zhangjun3:xxx --cacert /etc/docker/certs.d/10.64.3.7\:8000/ca.crt https://10.64.3.7:8000/v2/_catalog

{"repositories":["library/redis","zhangjun3/busybox","zhangjun3/pause","zhangjun3/pause2"]}

查询某个镜像的 tags 列表


$ curl --user zhangjun3:xxx --cacert /etc/docker/certs.d/10.64.3.7\:8000/ca.crt https://10.64.3.7:8000/v2/zhangjun3/busybox/tags/list

{"name":"zhangjun3/busybox","tags":["latest"]}

获取 image 或 layer 的 digest

向 v2/<repoName>/manifests/<tagName> 发 GET 请求,从响应的头部 Docker-Content-Digest 获取 image digest,从响应的body 的 fsLayers.blobSum 中获取 layDigests;

注意,必须包含请求头:Accept:application/vnd.docker.distribution.manifest.v2+json:


$ curl -v -H "Accept: application/vnd.docker.distribution.manifest.v2+json" --user zhangjun3:xxx --cacert /etc/docker/certs.d/10.64.3.7\:8000/ca.crt https://10.64.3.7:8000/v2/zhangjun3/busybox/manifests/latest

> GET /v2/zhangjun3/busybox/manifests/latest HTTP/1.1
> User-Agent: curl/7.29.0
> Host: 10.64.3.7:8000
> Accept: application/vnd.docker.distribution.manifest.v2+json
>
< HTTP/1.1 200 OK
< Content-Length: 527
< Content-Type: application/vnd.docker.distribution.manifest.v2+json
< Docker-Content-Digest: sha256:68effe31a4ae8312e47f54bec52d1fc925908009ce7e6f734e1b54a4169081c5
< Docker-Distribution-Api-Version: registry/2.0
< Etag: "sha256:68effe31a4ae8312e47f54bec52d1fc925908009ce7e6f734e1b54a4169081c5"
< X-Content-Type-Options: nosniff
< Date: Tue, 21 Mar 2017 15:19:42 GMT
<
{
   "schemaVersion": 2,
   "mediaType": "application/vnd.docker.distribution.manifest.v2+json",
   "config": {
      "mediaType": "application/vnd.docker.container.image.v1+json",
      "size": 1465,
      "digest": "sha256:00f017a8c2a6e1fe2ffd05c281f27d069d2a99323a8cd514dd35f228ba26d2ff"
   },
   "layers": [
      {
         "mediaType": "application/vnd.docker.image.rootfs.diff.tar.gzip",
         "size": 701102,
         "digest": "sha256:04176c8b224aa0eb9942af765f66dae866f436e75acef028fe44b8a98e045515"
      }
   ]
}

删除 image

向 /v2/<name>/manifests/<reference> 发送 DELETE 请求,reference为上一步返回的 Docker-Content-Digest 字段内容:


$ curl -X DELETE --user zhangjun3:xxx --cacert /etc/docker/certs.d/10.64.3.7\:8000/ca.crt https://10.64.3.7:8000/v2/zhangjun3/busybox/manifests/sha256:68effe31a4ae8312e47f54bec52d1fc925908009ce7e6f734e1b54a4169081c5

$

删除 layer

向 /v2/<name>/blobs/<digest>发送 DELETE 请求,其中 digest是上一步返回的 fsLayers.blobSum 字段内容:


$ curl -X DELETE --user zhangjun3:xxx --cacert /etc/docker/certs.d/10.64.3.7\:8000/ca.crt https://10.64.3.7:8000/v2/zhangjun3/busybox/blobs/sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4

$ curl -X DELETE --cacert /etc/docker/certs.d/10.64.3.7\:8000/ca.crt https://10.64.3.7:8000/v2/zhangjun3/busybox/blobs/sha256:04176c8b224aa0eb9942af765f66dae866f436e75acef028fe44b8a98e045515
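
删除 manifest/blob 后,后端存储的空间并不会立即释放,可以在 registry 容器内执行垃圾回收(示例命令,容器名沿用上面 --name registry 启动的容器):

$ docker exec -it registry bin/registry garbage-collect /etc/docker/registry/config.yml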


附件:

config.yml   我挂载的本地路径

version: 0.1
log:
  level: info
  formatter: text
  fields:
    service: registry
storage:
  cache:
    blobdescriptor: inmemory
  delete:
    enabled: true
  filesystem:
    rootdirectory: /var/lib/registry
auth:
  htpasswd:
    realm: basic-realm
    path: /auth/htpasswd
http:
  addr: 0.0.0.0:8000
  headers:
    X-Content-Type-Options: [nosniff]
  tls:
    certificate: /certs/registry.pem
    key: /certs/registry-key.pem
health:
  storagedriver:
    enabled: true
    interval: 10s
    threshold: 3




13-部署harbor私有仓库





部署 harbor 私有仓库


本文档介绍使用 docker-compose 部署 harbor私有仓库的步骤,你也可以使用 docker 官方的 registry 镜像部署私有仓库(


部署Docker Registry


)。

使用的变量

本文档用到的变量定义如下:


$ export NODE_IP=10.64.3.7   # 当前部署 harbor 的节点 IP

$

下载文件


从 docker compose



发布页面



下载最新的


docker-compose


二进制文件


$ wget https://github.com/docker/compose/releases/download/1.12.0/docker-compose-Linux-x86_64

$ mv ~/docker-compose-Linux-x86_64 /root/local/bin/docker-compose

$ chmod a+x /root/local/bin/docker-compose

$ export PATH=/root/local/bin:$PATH

$


从 harbor


发布页面


下载最新的 harbor 离线安装包


$ wget --continue https://github.com/vmware/harbor/releases/download/v1.1.0/harbor-offline-installer-v1.1.0.tgz

$ tar -xzvf harbor-offline-installer-v1.1.0.tgz

$ cd harbor

$

导入 docker images

导入离线安装包中 harbor 相关的 docker images:


$ docker load -i harbor.v1.1.0.tar.gz

$

创建 harbor nginx 服务器使用的 TLS 证书

创建 harbor 证书签名请求:


$ cat > harbor-csr.json <<EOF
{
  "CN": "harbor",
  "hosts": [
    "127.0.0.1",
    "$NODE_IP"
  ],
  "key": {
    "algo": "rsa",
    "size": 2048
  },
  "names": [
    {
      "C": "CN",
      "ST": "BeiJing",
      "L": "BeiJing",
      "O": "k8s",
      "OU": "System"
    }
  ]
}
EOF


  • hosts 字段指定授权使用该证书的当前部署节点 IP,如果后续使用域名访问 harbor则还需要添加域名;

生成 harbor 证书和私钥:


$ cfssl gencert -ca=/etc/kubernetes/ssl/ca.pem \
  -ca-key=/etc/kubernetes/ssl/ca-key.pem \
  -config=/etc/kubernetes/ssl/ca-config.json \
  -profile=kubernetes harbor-csr.json | cfssljson -bare harbor

$ ls harbor*

harbor.csr  harbor-csr.json  harbor-key.pem  harbor.pem

$ sudo mkdir -p /etc/harbor/ssl

$ sudo mv harbor*.pem /etc/harbor/ssl

$ rm harbor.csr harbor-csr.json

修改 harbor.cfg 文件


$ diff harbor.cfg.orig harbor.cfg

5c5

< hostname = reg.mydomain.com

---

> hostname = 10.64.3.7

9c9

< ui_url_protocol = http

---

> ui_url_protocol = https

24,25c24,25

< ssl_cert = /data/cert/server.crt

< ssl_cert_key = /data/cert/server.key

---

> ssl_cert = /etc/harbor/ssl/harbor.pem

> ssl_cert_key = /etc/harbor/ssl/harbor-key.pem

加载和启动 harbor 镜像

$ mkdir -p /data


$ ./install.sh

[Step 0]: checking installation environment …


Note: docker version: 17.04.0


Note: docker-compose version: 1.12.0


[Step 1]: loading Harbor images …

Loaded image: vmware/harbor-adminserver:v1.1.0

Loaded image: vmware/harbor-ui:v1.1.0

Loaded image: vmware/harbor-log:v1.1.0

Loaded image: vmware/harbor-jobservice:v1.1.0

Loaded image: vmware/registry:photon-2.6.0

Loaded image: vmware/harbor-notary-db:mariadb-10.1.10

Loaded image: vmware/harbor-db:v1.1.0

Loaded image: vmware/nginx:1.11.5-patched

Loaded image: photon:1.0

Loaded image: vmware/notary-photon:server-0.5.0

Loaded image: vmware/notary-photon:signer-0.5.0


[Step 2]: preparing environment …

Generated and saved secret to file: /data/secretkey

Generated configuration file: ./common/config/nginx/nginx.conf

Generated configuration file: ./common/config/adminserver/env

Generated configuration file: ./common/config/ui/env

Generated configuration file:./common/config/registry/config.yml

Generated configuration file: ./common/config/db/env

Generated configuration file: ./common/config/jobservice/env

Generated configuration file:./common/config/jobservice/app.conf

Generated configuration file: ./common/config/ui/app.conf

Generated certificate, key file: ./common/config/ui/private_key.pem, cert file:./common/config/registry/root.crt

The configuration files are ready, please use docker-compose to start theservice.


[Step 3]: checking existing instance of Harbor …


[Step 4]: starting Harbor…

Creating network


“harbor_harbor”


with the default driver

Creating harbor-log

Creating registry

Creating harbor-adminserver

Creating harbor-db

Creating harbor-ui

Creating harbor-jobservice

Creating nginx


✔ ----Harbor has been installed and started successfully.----

Now you should be able to visit the admin portal at https://10.64.3.7.

For more details, please visit https://github.com/vmware/harbor .

访问管理界面


浏览器访问 https://${NODE_IP},示例中是 https://10.64.3.7,用账号 admin 和 harbor.cfg 配置文件中的默认密码 Harbor12345 登陆系统:

harbor 运行时产生的文件、目录


$ # 日志目录

$ ls /var/log/harbor/2017-04-19/

adminserver.log  jobservice.log  mysql.log  proxy.log  registry.log  ui.log

$ # 数据目录,包括数据库、镜像仓库

$ ls /data/

ca_download  config  database  job_logs  registry  secretkey
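
也可以在 harbor 目录下用 docker-compose 查看各组件容器的运行状态(示例命令,目录为解压离线安装包后生成的 harbor 目录,按实际路径调整):

$ cd /root/harbor
$ docker-compose ps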

docker 客户端登陆


将签署 harbor 证书的 CA 证书拷贝到 /etc/docker/certs.d/10.64.3.7 目录下

$ sudo mkdir -p /etc/docker/certs.d/10.64.3.7

$ sudo cp /etc/kubernetes/ssl/ca.pem /etc/docker/certs.d/10.64.3.7/ca.crt

$

登陆 harbor


$ docker login 10.64.3.7

Username: admin

Password:


认证信息自动保存到 ~/.docker/config.json 文件。
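
登陆成功后,可以按下面的方式向 harbor 推送镜像做一次验证(示例命令,library 为 harbor 默认创建的项目,如使用其它项目请先在界面上创建):

$ docker tag docker.io/kubernetes/pause 10.64.3.7/library/pause
$ docker push 10.64.3.7/library/pause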

其它操作

下列操作的工作目录均为 解压离线安装文件后 生成的 harbor 目录。


$


# 停止 harbor



$ docker-compose down -v

$


# 修改配置



$ vim harbor.cfg

$


# 将修改后的配置更新到 docker-compose.yml 文件



[root@tjwq01-sys-bs003007 harbor]


# ./prepare



Clearing the configuration file: ./common/config/ui/app.conf

Clearing the configuration file: ./common/config/ui/env

Clearing the configuration file:./common/config/ui/private_key.pem

Clearing the configuration file: ./common/config/db/env

Clearing the configuration file:./common/config/registry/root.crt

Clearing the configuration file:./common/config/registry/config.yml

Clearing the configuration file:./common/config/jobservice/app.conf

Clearing the configuration file: ./common/config/jobservice/env

Clearing the configuration file:./common/config/nginx/cert/admin.pem

Clearing the configuration file:./common/config/nginx/cert/admin-key.pem

Clearing the configuration file: ./common/config/nginx/nginx.conf

Clearing the configuration file: ./common/config/adminserver/env

loaded secret from file: /data/secretkey

Generated configuration file: ./common/config/nginx/nginx.conf

Generated configuration file: ./common/config/adminserver/env

Generated configuration file: ./common/config/ui/env

Generated configuration file:./common/config/registry/config.yml

Generated configuration file: ./common/config/db/env

Generated configuration file: ./common/config/jobservice/env

Generated configuration file:./common/config/jobservice/app.conf

Generated configuration file: ./common/config/ui/app.conf

Generated certificate, key file: ./common/config/ui/private_key.pem, cert file:./common/config/registry/root.crt

The configuration files are ready, please use docker-compose to start theservice.

$


# 启动 harbor



[root@tjwq01-sys-bs003007 harbor]


# docker-compose up -d

附件:

harbor.cfg

hostname = 192.168.1.206

ui_url_protocol = https

db_password = root123

max_job_workers = 3

customize_crt = on

ssl_cert = /etc/harbor/ssl/harbor.pem

ssl_cert_key = /etc/harbor/ssl/harbor-key.pem

secretkey_path = /data

admiral_url = NA

email_identity =

email_server = smtp.mydomain.com

email_server_port = 25

email_username = sample_admin@mydomain.com

email_password = abc

email_from = admin <sample_admin@mydomain.com>

email_ssl = false

harbor_admin_password = Harbor12345

auth_mode = db_auth

ldap_url = ldaps://ldap.mydomain.com

ldap_basedn = ou=people,dc=mydomain,dc=com

ldap_uid = uid

ldap_scope = 3

ldap_timeout = 5

self_registration = on

token_expiration = 30

project_creation_restriction = everyone

verify_remote_cert = on




14-清理集群



清理集群


清理 Node 节点


停相关进程:


$ sudo systemctl stop kubelet kube-proxy flanneld docker

$

清理文件:


$ # umount kubelet 挂载的目录

$ mount | grep '/var/lib/kubelet' | awk '{print $3}' | xargs sudo umount

$


# 删除 kubelet 工作目录



$ sudo rm -rf /var/lib/kubelet

$


# 删除 docker 工作目录



$ sudo rm -rf /var/lib/docker

$


# 删除 flanneld 写入的网络配置文件



$ sudo rm -rf /var/run/flannel/

$


# 删除 docker 的一些运行文件



$ sudo rm -rf /var/run/docker/

$


# 删除 systemd unit 文件



$ sudo rm -rf /etc/systemd/system/{kubelet,docker,flanneld}.service

$


# 删除程序文件



$ sudo rm -rf /root/local/bin/{kubelet,docker,flanneld}

$


# 删除证书文件



$ sudo rm -rf /etc/flanneld/ssl /etc/kubernetes/ssl

$

清理 kube-proxy 和 docker 创建的 iptables:


$ sudo iptables -F && sudo iptables -X && sudo iptables -F -t nat && sudo iptables -X -t nat

$

删除 flanneld 和 docker 创建的网桥:


$ ip link del flannel.1

$ ip link del docker0

$
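
如果 Node 较多,也可以借助 environment.sh 中的 NODE_IPS 变量批量执行上述停服务等清理操作(仅为示意,请按实际节点和要执行的步骤调整):

$ source /root/local/bin/environment.sh
$ for node_ip in ${NODE_IPS}; do
    echo ">>> ${node_ip}"
    ssh root@${node_ip} "systemctl stop kubelet kube-proxy flanneld docker"
  done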

清理 Master 节点

停相关进程:


$ sudo systemctl stop kube-apiserver kube-controller-manager kube-scheduler

$

清理文件:


$


# 删除 kube-apiserver 工作目录



$ sudo rm -rf /var/run/kubernetes

$


# 删除 systemd unit 文件



$ sudo rm -rf /etc/systemd/system/{kube-apiserver,kube-controller-manager,kube-scheduler}.service

$


# 删除程序文件



$ sudo rm -rf /root/local/bin/{kube-apiserver,kube-controller-manager,kube-scheduler}

$


# 删除证书文件



$ sudo rm -rf /etc/flanneld/ssl /etc/kubernetes/ssl

$

清理 etcd 集群

停相关进程:


$ sudo systemctl stop etcd

$

清理文件:


$


# 删除 etcd 的工作目录和数据目录



$ sudo rm -rf /var/lib/etcd

$


# 删除 systemd unit 文件



$ sudo rm -rf /etc/systemd/system/etcd.service

$


# 删除程序文件



$ sudo rm -rf /root/local/bin/etcd

$


# 删除 TLS 证书文件



$ sudo rm -rf /etc/etcd/ssl/


*