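Node Scheduling and Management

Pods, which contain containers, run on nodes.
Nodes are managed by the control plane.
Scheduling on a specific node can be suspended (cordon) and resumed (uncordon).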

Node scheduling (cordon, uncordon)

kubectl cordon [node]

Existing pods are kept, but pods started from now on are not assigned to [node].

kubectl uncordon [node]

Pods started from now on can be assigned to [node] again.
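
A quick way to see which nodes are currently cordoned is to filter the STATUS column of `kubectl get nodes`. The sketch below is one possible helper, not part of kubectl; the function name `cordoned_nodes` is an assumption made for illustration, and it only parses the default table output passed on stdin.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the names of cordoned nodes.
# Reads the default `kubectl get nodes` table on stdin and keeps rows whose
# STATUS column contains "SchedulingDisabled".
cordoned_nodes() {
  awk 'NR > 1 && $2 ~ /SchedulingDisabled/ { print $1 }'
}

# Usage against a live cluster (assumes kubectl is already configured):
#   kubectl get nodes | cordoned_nodes
```

Reading from stdin keeps the helper independent of the cluster, so it can be tried on saved output as well.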

# cordon node2
kubectl cordon node2
node/node2 cordoned

# verify the cordon
kubectl get nodes
NAME     STATUS                     ROLES           AGE   VERSION
master   Ready                      control-plane   41h   v1.26.1
node1    Ready                      <none>          41h   v1.26.1
node2    Ready,SchedulingDisabled   <none>          41h   v1.26.1

# vi deploy-nginx.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: deploy-nginx
spec:
  replicas: 4
  selector:
    matchLabels:
      app: webui
  template:
    metadata:
      labels:
        app: webui
    spec:
      containers:
      - name: nginx-container
        image: nginx:1.14

# create the Deployment
kubectl apply -f deploy-nginx.yaml

# all pods run on node1 because node2 is cordoned
kubectl get pods -o wide

# uncordon
kubectl uncordon node2

# check node status
kubectl get nodes
...
NAME     STATUS   ROLES           AGE   VERSION
master   Ready    control-plane   41h   v1.26.1
node1    Ready    <none>          41h   v1.26.1
node2    Ready    <none>          41h   v1.26.1

# delete some pods at random
kubectl delete pod deploy-nginx-758f56d7-zh9mh
kubectl delete pod deploy-nginx-758f56d7-mrmnx

# confirm that the replacement pods are created on node2
kubectl get pods -o wide

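Draining a node (drain)

Evict and remove the pods running on a specific node (drain).

cordon keeps the pods that are currently running and only stops newly created pods from being assigned to the node. drain evicts every pod currently running on the node and also marks the node unschedulable, so newly created pods are not assigned to it either. Evicted pods that belong to a controller such as a Deployment are recreated on other nodes; pods without a controller are simply deleted.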

kubectl drain [node] --ignore-daemonsets --force

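--ignore-daemonsets: ignore DaemonSet-managed pods (they are left in place on the node).
--force: also delete pods that are not managed by an RC, RS, Job, DaemonSet, or StatefulSet. The default is --force=false, in which case drain refuses to proceed while such pods exist.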
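
Before draining, it can help to preview which pods are on the node. The following is a minimal sketch that filters `kubectl get pods -o wide` output by the NODE column; `pods_on_node` is a hypothetical helper name, not a kubectl command. It counts columns from the right because the RESTARTS column can contain spaces (for example `2 (19h ago)`).

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the names of pods running on a given node.
# Reads `kubectl get pods -o wide` output on stdin. NODE is the third column
# from the right (NODE, NOMINATED NODE, READINESS GATES), and counting from
# the right stays correct even when RESTARTS contains spaces.
pods_on_node() {
  awk -v node="$1" 'NR > 1 && $(NF - 2) == node { print $1 }'
}

# Usage against a live cluster (assumes kubectl is already configured):
#   kubectl get pods -o wide | pods_on_node node2
```

On a live cluster, `kubectl get pods -o wide --field-selector spec.nodeName=node2` gives a similar view without any text parsing.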

# drain node2
kubectl drain node2 --ignore-daemonsets --force

# check pods / the pods that were running on node2 now run on node1
kubectl get pods -o wide
...
NAME                          READY   STATUS    RESTARTS      AGE     IP                NODE    NOMINATED NODE   READINESS GATES
campus-01                     1/1     Running   2 (19h ago)   40h     192.168.166.150   node1   <none>           <none>
custom-app                    1/1     Running   2 (19h ago)   40h     192.168.166.152   node1   <none>           <none>
deploy-nginx-758f56d7-28lv6   1/1     Running   0             11m     192.168.166.185   node1   <none>           <none>
deploy-nginx-758f56d7-gqf4x   1/1     Running   0             38s     192.168.166.134   node1   <none>           <none>
deploy-nginx-758f56d7-k9hjn   1/1     Running   0             11m     192.168.166.187   node1   <none>           <none>
deploy-nginx-758f56d7-v89jr   1/1     Running   0             38s     192.168.166.128   node1   <none>           <none>
eshop-cart-app                2/2     Running   0             148m    192.168.166.160   node1   <none>           <none>
front-end-6f49fd5bf9-dws4k    1/1     Running   0             19h     192.168.166.147   node1   <none>           <none>
front-end-6f49fd5bf9-vpcgr    1/1     Running   0             38s     192.168.166.191   node1   <none>           <none>
nginx-static-pod-node1        1/1     Running   0             3h26m   192.168.166.158   node1   <none>           <none>

# check node status
kubectl get nodes
...
NAME     STATUS                     ROLES           AGE   VERSION
master   Ready                      control-plane   41h   v1.26.1
node1    Ready                      <none>          41h   v1.26.1
node2    Ready,SchedulingDisabled   <none>          41h   v1.26.1

# uncordon
kubectl uncordon node2

kubectl get nodes
...
NAME     STATUS   ROLES           AGE   VERSION
master   Ready    control-plane   41h   v1.26.1
node1    Ready    <none>          41h   v1.26.1
node2    Ready    <none>          41h   v1.26.1

note

Mark node2 as unschedulable and reschedule all pods running on it onto other nodes.

kubectl get deployments.apps deploy-nginx
kubectl scale deployment deploy-nginx --replicas=8

# drain node2 so that the deploy-nginx pods running on it are rescheduled onto node1
kubectl drain node2 --ignore-daemonsets --force

# check node2 status
kubectl get nodes

# check the pods running on node1
kubectl get pods -o wide
...
NAME                          READY   STATUS    RESTARTS      AGE     IP                NODE    NOMINATED NODE   READINESS GATES
campus-01                     1/1     Running   2 (19h ago)   40h     192.168.166.150   node1   <none>           <none>
custom-app                    1/1     Running   2 (19h ago)   40h     192.168.166.152   node1   <none>           <none>
deploy-nginx-758f56d7-28lv6   1/1     Running   0             18m     192.168.166.185   node1   <none>           <none>
deploy-nginx-758f56d7-5x6lh   1/1     Running   0             41s     192.168.166.132   node1   <none>           <none>
deploy-nginx-758f56d7-965zf   1/1     Running   0             41s     192.168.166.130   node1   <none>           <none>
deploy-nginx-758f56d7-9ggsk   1/1     Running   0             41s     192.168.166.136   node1   <none>           <none>
deploy-nginx-758f56d7-gqf4x   1/1     Running   0             7m19s   192.168.166.134   node1   <none>           <none>
deploy-nginx-758f56d7-k9hjn   1/1     Running   0             18m     192.168.166.187   node1   <none>           <none>
deploy-nginx-758f56d7-v89jr   1/1     Running   0             7m19s   192.168.166.128   node1   <none>           <none>
deploy-nginx-758f56d7-wjfwg   1/1     Running   0             41s     192.168.166.131   node1   <none>           <none>
eshop-cart-app                2/2     Running   0             155m    192.168.166.160   node1   <none>           <none>
front-end-6f49fd5bf9-dws4k    1/1     Running   0             19h     192.168.166.147   node1   <none>           <none>
front-end-6f49fd5bf9-vpcgr    1/1     Running   0             7m19s   192.168.166.191   node1   <none>           <none>
nginx-static-pod-node1        1/1     Running   0             3h33m   192.168.166.158   node1   <none>           <none>


# uncordon
kubectl uncordon node2

node taint, Pod toleration

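When a taint is set on a worker node, only Pods with a matching toleration are placed on it.
A Pod with a toleration can be placed on any node, including nodes that carry the matching taint.

You can set a taint on specific nodes in a Kubernetes cluster. Whereas node affinity is a Pod property that attracts the Pod to particular nodes, a taint is the opposite: a node property that lets the node repel a set of pods.

By default, a tainted node does not schedule pods. To schedule pods onto a tainted node, set a toleration on the pod. The node then runs only the pods whose tolerations match the taint and keeps other pods off.

Taints and tolerations are mainly used to dedicate nodes to a specific role. For example, a node can be reserved for database pods so that they have the node's CPU and RAM to themselves. Nodes with GPUs can likewise be set up so that only pods that actually use GPU resources run on them.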

taint example

kubectl taint nodes [node] <key>=<value>:<effect>

# a pod cannot be scheduled on node1 unless it has a toleration matching key1=value1
kubectl taint nodes node1 key1=value1:NoSchedule

# remove the taint
kubectl taint nodes node1 key1=value1:NoSchedule-

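The effect field accepts NoSchedule, PreferNoSchedule, and NoExecute.

NoSchedule: pods without a matching toleration are not scheduled on the node. Pods already running on the node are not affected.
PreferNoSchedule: the scheduler tries to avoid placing pods without a matching toleration on the node, but this is not guaranteed. If the cluster is short on resources, such pods can still be scheduled on the tainted node.
NoExecute: new pods without a matching toleration are not scheduled, and pods already running on the node without one are evicted.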
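
The `<key>=<value>:<effect>` form passed to `kubectl taint` can be split apart with plain shell parameter expansion. The sketch below is illustrative only: `parse_taint` is a hypothetical helper, not a kubectl feature, and it accepts only the three effect values named in this section.

```shell
#!/usr/bin/env bash
# Hypothetical helper: split a taint spec of the form key=value:effect
# (or key:effect when the taint has no value) into its parts.
parse_taint() {
  local spec="$1"
  local effect="${spec##*:}"   # text after the last ':'
  local kv="${spec%:*}"        # text before the last ':'
  local key="${kv%%=*}"        # text before the first '='
  local value=""
  case "$kv" in
    *=*) value="${kv#*=}" ;;   # text after the first '=', if any
  esac

  # Reject anything that is not one of the three known effects.
  case "$effect" in
    NoSchedule|PreferNoSchedule|NoExecute) ;;
    *) echo "unknown effect: $effect" >&2; return 1 ;;
  esac

  printf 'key=%s value=%s effect=%s\n' "$key" "$value" "$effect"
}

# Usage:
#   parse_taint key1=value1:NoSchedule
#   parse_taint node-role.kubernetes.io/control-plane:NoSchedule
```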

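Toleration

Tolerations are specified in the PodSpec. Under the tolerations field, set key, value, and effect to the values of the taint you want to tolerate. The operator field accepts Equal and Exists.

Equal requires the key, value, and effect to all equal the taint's settings. Exists matches on the key (and, optionally, the effect) without comparing values; when operator is Exists, the value field must not be set.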
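
The matching rules can be written out as a small shell function. This is a simplified sketch of the documented rules and not the scheduler's actual code; the function name `tolerates` and its argument order are assumptions made for illustration. It also reflects three documented defaults: an omitted operator behaves as Equal, an empty toleration effect matches every effect, and an empty key with Exists matches every key.

```shell
#!/usr/bin/env bash
# Hypothetical helper: does a toleration match a taint?
# Arguments: T_KEY T_OPERATOR T_VALUE T_EFFECT  TAINT_KEY TAINT_VALUE TAINT_EFFECT
# Returns 0 when the toleration matches the taint, 1 otherwise.
tolerates() {
  local t_key="$1" t_op="$2" t_value="$3" t_effect="$4"
  local taint_key="$5" taint_value="$6" taint_effect="$7"

  # An empty toleration effect matches every taint effect.
  if [ -n "$t_effect" ] && [ "$t_effect" != "$taint_effect" ]; then
    return 1
  fi

  case "$t_op" in
    Exists)
      # Only the key is compared; an empty key matches every taint key.
      [ -z "$t_key" ] || [ "$t_key" = "$taint_key" ]
      ;;
    Equal|"")
      # Key and value must both be equal (Equal is the default operator).
      [ "$t_key" = "$taint_key" ] && [ "$t_value" = "$taint_value" ]
      ;;
    *)
      return 1
      ;;
  esac
}

# Usage:
#   tolerates key1 Equal value1 NoSchedule  key1 value1 NoSchedule && echo match
#   tolerates key1 Exists "" NoSchedule     key1 value1 NoSchedule && echo match
```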

# When application pods are started, a pod without a toleration matching
# the taint cannot be placed on a node tainted with NoSchedule.
# check taints
kubectl describe node master | grep -i taint

# Because of this taint, pods created with default settings are not placed on the control plane.
# If a pod tolerates the taint, it can be deployed to every node, including the master node.
Taints:             node-role.kubernetes.io/control-plane:NoSchedule

kubectl describe node node1 | grep 'Taints'
Taints:             <none>

kubectl describe node node2 | grep 'Taints'
Taints:             <none>


# generate the Deployment YAML file
kubectl create deployment testdep --image=nginx --replicas=5 --dry-run=client -o yaml > testdep.yaml

# vi testdep.yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: testdep
spec:
  replicas: 5
  selector:
    matchLabels:
      app: testdep
  template:
    metadata:
      labels:
        app: testdep
    spec:
      tolerations: # add
      - key: "node-role.kubernetes.io/control-plane"
        operator: "Equal"
        effect: "NoSchedule"
      containers:
      - image: nginx
        name: nginx

# create the Deployment
kubectl apply -f testdep.yaml

# confirm that a pod is created on the master node
kubectl get pods -o wide | grep -i testdep
...
testdep-5dc784db9f-9dtzg      1/1     Running   0             42s     192.168.166.141   node1    <none>           <none>
testdep-5dc784db9f-ckh5b      1/1     Running   0             42s     192.168.104.39    node2    <none>           <none>
testdep-5dc784db9f-dj76q      1/1     Running   0             42s     192.168.104.40    node2    <none>           <none>
testdep-5dc784db9f-h2rkh      1/1     Running   0             42s     192.168.104.41    node2    <none>           <none>
testdep-5dc784db9f-knnb5      1/1     Running   0             42s     192.168.219.76    master   <none>           <none>

note

Find the nodes that are in the Ready state (excluding nodes tainted with NoSchedule) and write their count to /var/CKA2022/notaint_ready_node.

kubectl config current-context
kubectl config use-context hk8s

# ready: 3
kubectl get nodes
...
NAME     STATUS   ROLES           AGE     VERSION
master   Ready    control-plane   2d13h   v1.26.1
node1    Ready    <none>          2d13h   v1.26.1
node2    Ready    <none>          2d13h   v1.26.1

# check each node's taints / master
kubectl describe node master | grep -i -e taint -e noschedule
...
Taints:             node-role.kubernetes.io/control-plane:NoSchedule

# node1
kubectl describe node node1 | grep -i -e taint -e noschedule
...
Taints:             <none>

# node2
kubectl describe node node2 | grep -i -e taint -e noschedule
...
Taints:             <none>

sudo -i
# node1 and node2: 2 in total
echo "2" > /var/CKA2022/notaint_ready_node
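
The manual steps above can also be scripted. The following is a rough sketch, assuming kubectl is configured for the target cluster; `count_ready_untainted` is a hypothetical function name, and matching on the literal text `:NoSchedule` in `kubectl describe node` output is a simplification (it does not match `:PreferNoSchedule`).

```shell
#!/usr/bin/env bash
# Hypothetical helper: count nodes whose STATUS is exactly "Ready" and whose
# description shows no taint with the NoSchedule effect.
count_ready_untainted() {
  local count=0 node
  for node in $(kubectl get nodes --no-headers | awk '$2 == "Ready" { print $1 }'); do
    # Taints are printed as key=value:Effect or key:Effect.
    if ! kubectl describe node "$node" | grep -q ':NoSchedule'; then
      count=$((count + 1))
    fi
  done
  echo "$count"
}

# Usage against a live cluster (run as a user who may write to the target path):
#   count_ready_untainted > /var/CKA2022/notaint_ready_node
```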

Pod scheduling: NodeSelector

Select a node using the labels assigned to worker nodes.
Setting node labels
- kubectl label node [node] [label]=[value]
- kubectl label node node1 gpu=true
- kubectl get nodes -L gpu
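
A nodeSelector entry such as gpu: "true" only matches nodes that carry exactly that label. As a sketch, the helper below filters `kubectl get nodes --show-labels` output for one key=value pair; `nodes_with_label` is a hypothetical name used for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print nodes that carry an exact key=value label.
# Reads `kubectl get nodes --show-labels` output on stdin; the last column
# (LABELS) is a comma-separated list of key=value pairs.
nodes_with_label() {
  awk -v want="$1" 'NR > 1 {
    n = split($NF, labels, ",")
    for (i = 1; i <= n; i++) {
      if (labels[i] == want) { print $1; break }
    }
  }'
}

# Usage against a live cluster (assumes kubectl is already configured):
#   kubectl get nodes --show-labels | nodes_with_label gpu=true
```

On a live cluster, `kubectl get nodes -l gpu=true` applies the same filter with the built-in label selector.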

# check labels
kubectl get nodes --show-labels

# set the label gpu=true on node2
kubectl label node node2 gpu=true

# check the gpu label
kubectl get nodes -L gpu
...
NAME     STATUS   ROLES           AGE   VERSION   GPU
master   Ready    control-plane   42h   v1.26.1
node1    Ready    <none>          42h   v1.26.1   true
node2    Ready    <none>          42h   v1.26.1   true

# change node1 to gpu=false
kubectl label node node1 gpu=false --overwrite

# verify the change to gpu=false
kubectl get nodes -L gpu
...
NAME     STATUS   ROLES           AGE   VERSION   GPU
master   Ready    control-plane   42h   v1.26.1
node1    Ready    <none>          42h   v1.26.1   false
node2    Ready    <none>          42h   v1.26.1   true

# remove the gpu label from node1
kubectl label node node1 gpu-

# verify that the gpu label was removed from node1
kubectl get nodes -L gpu
...
NAME     STATUS   ROLES           AGE   VERSION   GPU
master   Ready    control-plane   42h   v1.26.1
node1    Ready    <none>          42h   v1.26.1
node2    Ready    <none>          42h   v1.26.1   true

# vi pod-tensorflow.yaml
apiVersion: v1
kind: Pod
metadata:
  name: tensorflow-gpu
spec:
  nodeSelector:
    gpu: "true" # run the Pod on a node labeled gpu=true
  containers:
  - name: tensorflow
    image: tensorflow/tensorflow:nightly-jupyter
    ports:
    - containerPort: 8888
      protocol: TCP

# create the Pod
kubectl apply -f pod-tensorflow.yaml

# check that the Pod is created on node2, which has the gpu label
kubectl get pods -o wide | grep -i tensorflow-gpu

note

Create a pod with the following conditions. Image: nginx, NodeSelector: disktype=ssd

# check the disktype label
kubectl get nodes -L disktype
...
NAME     STATUS   ROLES           AGE   VERSION   DISKTYPE
master   Ready    control-plane   42h   v1.26.1
node1    Ready    <none>          42h   v1.26.1   ssd
node2    Ready    <none>          42h   v1.26.1   std

# As shown above, only node1 has disktype set to ssd,
# so the pod should be created only on node1.

# generate the pod YAML file
kubectl run eshop-store --image=nginx --dry-run=client -o yaml
kubectl run eshop-store --image=nginx --dry-run=client -o yaml > eshop.yaml

# vi eshop.yaml
apiVersion: v1
kind: Pod
metadata:
  name: eshop-store
spec:
  nodeSelector:
    disktype: "ssd"
  containers:
  - image: nginx
    name: eshop-store

# create the pod
kubectl apply -f eshop.yaml

# check that the pod runs on node1, whose disktype is ssd
kubectl get pods -o wide | grep -i eshop-store

https://kubernetes.io/ko/docs/concepts/scheduling-eviction/taint-and-toleration/
