Kodekloud Kubernetes Challenge 2 solution | Troubleshoot, fix the Kubernetes cluster issues | Create new PersistentVolume & PersistentVolumeClaim


Question: This 2-node Kubernetes cluster is broken! Troubleshoot and fix the cluster issues, then deploy the objects according to the given architecture diagram to unlock our Image Gallery!

1. Controlplane

  • kubeconfig = /root/.kube/config, User = 'kubernetes-admin', Cluster: Server Port = '6443'
  • Fix kube-apiserver. Make sure it's running and healthy.
  • Master node: coredns deployment has image: 'k8s.gcr.io/coredns/coredns:v1.8.6'

2. node01 is ready and can schedule pods?

3. Copy all images from the directory '/media' on the controlplane node to '/web' directory on node01

4. Data-PV
  • Create new PersistentVolume = 'data-pv'
  • PersistentVolume = data-pv, accessModes = 'ReadWriteMany'
  • PersistentVolume = data-pv, hostPath = '/web'
  • PersistentVolume = data-pv, storage = '1Gi'
5. Data-PVC
  • Create new PersistentVolumeClaim = 'data-pvc'
  • PersistentVolumeClaim = 'data-pvc', accessModes = 'ReadWriteMany'
  • PersistentVolumeClaim = 'data-pvc', storage request = '1Gi'
  • PersistentVolumeClaim = 'data-pvc', volumeName = 'data-pv'
6. Gop-fileserver
  • Create a pod for fileserver, name: 'gop-fileserver'
  • pod: gop-fileserver image: 'kodekloud/fileserver'
  • pod: gop-fileserver mountPath: '/web'
  • pod: gop-fileserver volumeMount name: 'data-store'
  • pod: gop-fileserver persistent volume name: data-store
  • pod: gop-fileserver persistent volume claim used: 'data-pvc'
7. Gop-fs-service
  • New Service, name: 'gop-fs-service'
  • Service name: gop-fs-service, port: '8080'
  • Service name: gop-fs-service, targetPort: '8080'
Solution: 

1. Troubleshoot the controlplane & fix the Kubernetes cluster issues

root@controlplane ~   kubectl get nodes

The connection to the server controlplane:6433 was refused - did you specify the right host or port?

 root@controlplane ~

  • kubeconfig = /root/.kube/config, User = 'kubernetes-admin', Cluster: Server Port = '6443'

root@controlplane ~ cat .kube/config |grep server

    server: https://controlplane:6433

 root@controlplane ~   vi .kube/config

 root@controlplane ~   cat .kube/config |grep server

    server: https://controlplane:6443

 root@controlplane ~  
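Alternatively, the port can be corrected non-interactively with sed (assuming '6433' appears only in the server URL):

root@controlplane ~   sed -i 's/controlplane:6433/controlplane:6443/' /root/.kube/config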

  • Fix kube-apiserver. Make sure it's running and healthy.

root@controlplane ~   cd /var/log/pods

 root@controlplane /var/log/pods

 root@controlplane /var/log/pods   ls -ltr

total 28

drwxr-xr-x 3 root root 4096 Dec 24 07:43 kube-system_kube-controller-manager-controlplane_9fbce1211115f84f542b8c91fb31ce00

drwxr-xr-x 3 root root 4096 Dec 24 07:43 kube-system_etcd-controlplane_be97a386036153051542366141a462b7

drwxr-xr-x 3 root root 4096 Dec 24 07:43 kube-system_kube-scheduler-controlplane_233effdc8fccb749f537f2acea5a7295

drwxr-xr-x 3 root root 4096 Dec 24 07:44 kube-system_kube-proxy-4mdwl_6cf68654-1c3a-4ed4-a439-ee72eb0e8770

drwxr-xr-x 5 root root 4096 Dec 24 07:44 kube-system_weave-net-jr8wj_2e21759b-ed59-4266-b232-642b6fe65a39

drwxr-xr-x 2 root root 4096 Dec 24 08:40 kube-system_coredns-7b945bfcb7-7cw85_16c49fe7-b4de-4c3b-9055-08e0fcb4640f

drwxr-xr-x 3 root root 4096 Dec 24 08:40 kube-system_kube-apiserver-controlplane_079e1f452a2a4e540644498c55816070

 root@controlplane /var/log/pods   cd kube-system_kube-apiserver-controlplane_079e1f452a2a4e540644498c55816070

 root@controlplane log/pods/kube-system_kube-apiserver-controlplane_079e1f452a2a4e540644498c55816070   ls

kube-apiserver

 root@controlplane log/pods/kube-system_kube-apiserver-controlplane_079e1f452a2a4e540644498c55816070   cat kube-apiserver/4.log

{"log":"I1224 08:42:38.560348       1 server.go:565] external host was not specified, using 10.14.42.3\n","stream":"stderr","time":"2022-12-24T08:42:38.560707175Z"}

{"log":"I1224 08:42:38.561221       1 server.go:172] Version: v1.23.0\n","stream":"stderr","time":"2022-12-24T08:42:38.561614097Z"}

{"log":"E1224 08:42:38.887741       1 run.go:120] \"command failed\" err=\"open /etc/kubernetes/pki/ca-authority.crt: no such file or directory\"\n","stream":"stderr","time":"2022-12-24T08:42:38.887999013Z"}

 root@controlplane log/pods/kube-system_kube-apiserver-controlplane_079e1f452a2a4e540644498c55816070  


The logs show the kube-apiserver failing because /etc/kubernetes/pki/ca-authority.crt does not exist. List the certificates that do exist, then correct the certificate name in the kube-apiserver.yaml static pod manifest.

root@controlplane ~ ls -l /etc/kubernetes/pki/*.crt

-rw-r--r-- 1 root root 1289 Dec 24 07:42 /etc/kubernetes/pki/apiserver.crt

-rw-r--r-- 1 root root 1155 Dec 24 07:42 /etc/kubernetes/pki/apiserver-etcd-client.crt

-rw-r--r-- 1 root root 1164 Dec 24 07:42 /etc/kubernetes/pki/apiserver-kubelet-client.crt

-rw-r--r-- 1 root root 1099 Dec 24 07:42 /etc/kubernetes/pki/ca.crt

-rw-r--r-- 1 root root 1115 Dec 24 07:42 /etc/kubernetes/pki/front-proxy-ca.crt

-rw-r--r-- 1 root root 1119 Dec 24 07:42 /etc/kubernetes/pki/front-proxy-client.crt

 root@controlplane ~  

 root@controlplane ~   vi /etc/kubernetes/manifests/kube-apiserver.yaml

 root@controlplane ~   cat /etc/kubernetes/manifests/kube-apiserver.yaml |grep crt

    - --client-ca-file=/etc/kubernetes/pki/ca.crt

    - --etcd-cafile=/etc/kubernetes/pki/etcd/ca.crt

    - --etcd-certfile=/etc/kubernetes/pki/apiserver-etcd-client.crt

    - --kubelet-client-certificate=/etc/kubernetes/pki/apiserver-kubelet-client.crt

    - --proxy-client-cert-file=/etc/kubernetes/pki/front-proxy-client.crt

    - --requestheader-client-ca-file=/etc/kubernetes/pki/front-proxy-ca.crt

    - --tls-cert-file=/etc/kubernetes/pki/apiserver.crt


Once the file is updated, make sure to restart the kubelet daemon so the static pod is recreated:

root@controlplane ~  systemctl restart  kubelet

 root@controlplane ~  systemctl status kubelet

● kubelet.service - kubelet: The Kubernetes Node Agent

   Loaded: loaded (/lib/systemd/system/kubelet.service; enabled; vendor preset: enabled)

  Drop-In: /etc/systemd/system/kubelet.service.d

           └─10-kubeadm.conf

   Active: active (running) since Sat 2022-12-24 08:23:37 UTC; 8s ago

     Docs: https://kubernetes.io/docs/home/

 Main PID: 24621 (kubelet)

    Tasks: 33 (limit: 251382)

   CGroup: /system.slice/kubelet.service

           └─24621 /usr/bin/kubelet --bootstrap-kubeconfig=/etc/kubernetes/bootstrap-kubelet.conf --kubeconfig=/etc/kubernetes/kubelet.conf --config=/var/lib/kubelet/confi

 Dec 24 08:23:44 controlplane kubelet[24621]: I1224 08:23:44.289848   24621 reconciler.go:216] "operationExecutor.VerifyControllerAttachedVolume started for volume \"kube-p

Dec 24 08:23:44 controlplane kubelet[24621]: I1224 08:23:44.289874   24621 reconciler.go:216] "operationExecutor.VerifyControllerAttachedVolume started for volume \"weaved

Dec 24 08:23:44 controlplane kubelet[24621]: I1224 08:23:44.289929   24621 reconciler.go:216] "operationExecutor.VerifyControllerAttachedVolume started for volume \"machin

Dec 24 08:23:44 controlplane kubelet[24621]: I1224 08:23:44.289959   24621 reconciler.go:216] "operationExecutor.VerifyControllerAttachedVolume started for volume \"lib-mo

Dec 24 08:23:44 controlplane kubelet[24621]: I1224 08:23:44.289975   24621 reconciler.go:157] "Reconciler: start to sync state"

Dec 24 08:23:44 controlplane kubelet[24621]: E1224 08:23:44.902326   24621 gcpcredential.go:74] while reading 'google-dockercfg-url' metadata: http status code: 404 while

Dec 24 08:23:45 controlplane kubelet[24621]: E1224 08:23:45.033948   24621 remote_image.go:216] "PullImage from image service failed" err="rpc error: code = Unknown desc =

Dec 24 08:23:45 controlplane kubelet[24621]: E1224 08:23:45.034131   24621 kuberuntime_manager.go:918] container &Container{Name:coredns,Image:k8s.gcr.io/kubedns:1.3.1,Com

Dec 24 08:23:45 controlplane kubelet[24621]: E1224 08:23:45.034175   24621 pod_workers.go:918] "Error syncing pod, skipping" err="failed to \"StartContainer\" for \"coredn

Dec 24 08:23:45 controlplane kubelet[24621]: I1224 08:23:45.704929   24621 scope.go:110] "RemoveContainer" containerID="77b29ebd841d56d6eb8764c9d6d48f115ad6fcdd663852f1054

 

root@controlplane ~   kubectl get pods -A

NAMESPACE     NAME                                   READY   STATUS             RESTARTS       AGE

kube-system   coredns-7b945bfcb7-5ppxl               0/1     ImagePullBackOff   0              8m14s

kube-system   coredns-7b945bfcb7-fxpgc               0/1     ErrImagePull       0              8m14s

kube-system   etcd-controlplane                      1/1     Running            0              41m

kube-system   kube-apiserver-controlplane            1/1     Running            0              41m

kube-system   kube-controller-manager-controlplane   1/1     Running            1 (8m3s ago)   41m

kube-system   kube-proxy-47fdp                       1/1     Running            0              40m

kube-system   kube-proxy-68nd6                       1/1     Running            0              41m

kube-system   kube-scheduler-controlplane            1/1     Running            1 (8m3s ago)   41m

kube-system   weave-net-9xh52                        2/2     Running            1 (40m ago)    41m

kube-system   weave-net-j6wd9                        2/2     Running            0              40m

 root@controlplane ~

  • Master node: coredns deployment has image: 'k8s.gcr.io/coredns/coredns:v1.8.6'

root@controlplane ~  kubectl describe pods coredns-7b945bfcb7-fxpgc  -n kube-system

Name:                 coredns-7b945bfcb7-fxpgc

Namespace:            kube-system

Priority:             2000000000

Priority Class Name:  system-cluster-critical

Node:                 controlplane/10.14.31.9

Start Time:           Sat, 24 Dec 2022 08:15:46 +0000

Labels:               k8s-app=kube-dns

                      pod-template-hash=7b945bfcb7

Annotations:          kubectl.kubernetes.io/restartedAt: 2022-05-17T05:37:09Z

Status:               Pending

IP:                   10.50.0.4

IPs:

  IP:           10.50.0.4

Controlled By:  ReplicaSet/coredns-7b945bfcb7

Containers:

  coredns:

    Container ID: 

    Image:         k8s.gcr.io/kubedns:1.3.1

    Image ID:     

    Ports:         53/UDP, 53/TCP, 9153/TCP

    Host Ports:    0/UDP, 0/TCP, 0/TCP

    Args:

      -conf

      /etc/coredns/Corefile

    State:          Waiting

      Reason:       ErrImagePull

    Ready:          False

    Restart Count:  0

    Limits:

      memory:  170Mi

    Requests:

      cpu:        100m

      memory:     70Mi

    Liveness:     http-get http://:8080/health delay=60s timeout=5s period=10s #success=1 #failure=5

    Readiness:    http-get http://:8181/ready delay=0s timeout=1s period=10s #success=1 #failure=3

    Environment:  <none>

    Mounts:

      /etc/coredns from config-volume (ro)

      /var/run/secrets/kubernetes.io/serviceaccount from kube-api-access-84zk6 (ro)

Conditions:

  Type              Status

  Initialized       True

  Ready             False

  ContainersReady   False

  PodScheduled      True

Volumes:

  config-volume:

    Type:      ConfigMap (a volume populated by a ConfigMap)

    Name:      coredns

    Optional:  false

  kube-api-access-84zk6:

    Type:                    Projected (a volume that contains injected data from multiple sources)

    TokenExpirationSeconds:  3607

    ConfigMapName:           kube-root-ca.crt

    ConfigMapOptional:       <nil>

    DownwardAPI:             true

QoS Class:                   Burstable

Node-Selectors:              kubernetes.io/os=linux

Tolerations:                 CriticalAddonsOnly op=Exists

                             node-role.kubernetes.io/control-plane:NoSchedule

                             node-role.kubernetes.io/master:NoSchedule

                             node.kubernetes.io/not-ready:NoExecute op=Exists for 300s

                             node.kubernetes.io/unreachable:NoExecute op=Exists for 300s

Events:

  Type     Reason     Age                  From               Message

  ----     ------     ----                 ----               -------

  Normal   Scheduled  11m                  default-scheduler  Successfully assigned kube-system/coredns-7b945bfcb7-fxpgc to controlplane

  Normal   Pulling    106s (x4 over 3m3s)  kubelet            Pulling image "k8s.gcr.io/kubedns:1.3.1"

  Warning  Failed     106s (x4 over 3m2s)  kubelet            Failed to pull image "k8s.gcr.io/kubedns:1.3.1": rpc error: code = Unknown desc = Error response from daemon: manifest for k8s.gcr.io/kubedns:1.3.1 not found: manifest unknown: Failed to fetch "1.3.1" from request "/v2/kubedns/manifests/1.3.1".

  Warning  Failed     106s (x4 over 3m2s)  kubelet            Error: ErrImagePull

  Warning  Failed     69s (x6 over 2m38s)  kubelet            Error: ImagePullBackOff

  Normal   BackOff    57s (x7 over 2m38s)  kubelet            Back-off pulling image "k8s.gcr.io/kubedns:1.3.1"

 

root@controlplane ~

root@controlplane ~  kubectl set image deployment/coredns -n kube-system \

>     coredns=k8s.gcr.io/coredns/coredns:v1.8.6

deployment.apps/coredns image updated

 

root@controlplane ~   kubectl get pods -A

NAMESPACE     NAME                                   READY   STATUS    RESTARTS      AGE

kube-system   coredns-98c786496-b8jnw                1/1     Running   0             9s

kube-system   coredns-98c786496-tq9mv                1/1     Running   0             9s

kube-system   etcd-controlplane                      1/1     Running   0             45m

kube-system   kube-apiserver-controlplane            1/1     Running   0             45m

kube-system   kube-controller-manager-controlplane   1/1     Running   1 (11m ago)   45m

kube-system   kube-proxy-47fdp                       1/1     Running   0             44m

kube-system   kube-proxy-68nd6                       1/1     Running   0             44m

kube-system   kube-scheduler-controlplane            1/1     Running   1 (11m ago)   45m

kube-system   weave-net-9xh52                        2/2     Running   1 (44m ago)   44m

kube-system   weave-net-j6wd9                        2/2     Running   0             44m

 root@controlplane ~  


2. node01 is ready and can schedule pods?

root@controlplane ~   kubectl get nodes

NAME           STATUS                     ROLES                  AGE   VERSION

controlplane   Ready                      control-plane,master   45m   v1.23.0

node01         Ready,SchedulingDisabled   <none>                 44m   v1.23.0

 root@controlplane ~  
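node01 is Ready but marked SchedulingDisabled, meaning it has been cordoned. Uncordon it so pods can be scheduled there:

root@controlplane ~   kubectl uncordon node01

node/node01 uncordoned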

 root@controlplane ~   kubectl get nodes

NAME           STATUS   ROLES                  AGE   VERSION

controlplane   Ready    control-plane,master   45m   v1.23.0

node01         Ready    <none>                 44m   v1.23.0

 root@controlplane ~


3. Copy all images from the directory '/media' on the controlplane node to '/web' directory on node01

root@controlplane ~   ls /media/

kodekloud-ckad.png  kodekloud-cka.png  kodekloud-cks.png

 root@controlplane ~   scp /media/* node01:/web

kodekloud-ckad.png                                                                                                                       100%   58KB  59.2MB/s   00:00   

kodekloud-cka.png                                                                                                                        100%   57KB  73.4MB/s   00:00   

kodekloud-cks.png                                                                                                                        100%   61KB  76.9MB/s   00:00   

 root@controlplane ~


4. I have created YAML manifest files with all the required parameters. Kindly clone the repo, or copy the files from GitLab:

git clone https://gitlab.com/nb-tech-support/devops.git

(Refer to the video below for more clarity.)

5. Create new PersistentVolume = 'data-pv' & validate the status 
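For reference, the fileserver-pv.yaml applied below is, in essence, a minimal PersistentVolume matching the stated requirements (name data-pv, accessModes ReadWriteMany, hostPath /web, 1Gi capacity). This sketch is my reconstruction, not the repo file verbatim:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: data-pv
spec:
  accessModes:
    - ReadWriteMany
  capacity:
    storage: 1Gi
  hostPath:
    path: /web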

root@controlplane ~   kubectl get pv -A

No resources found

 root@controlplane ~   kubectl apply -f  devops/kubernetes-challenges/challenge-2/fileserver-pv.yaml

persistentvolume/data-pv created

 root@controlplane ~   kubectl get pv -A

NAME      CAPACITY   ACCESS MODES   RECLAIM POLICY   STATUS      CLAIM   STORAGECLASS   REASON   AGE

data-pv   1Gi        RWX            Retain           Available                                   2s

 root@controlplane ~


6. Create new PersistentVolumeClaim = 'data-pvc' & validate the status 
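Likewise, a sketch of fileserver-pvc.yaml based on the requirements (name data-pvc, ReadWriteMany, 1Gi request, bound explicitly to data-pv via volumeName):

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-pvc
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
  volumeName: data-pv

The claim may briefly show as Pending right after creation, as in the output below, before it binds to data-pv.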

root@controlplane ~   kubectl get pvc  -A

No resources found in default namespace.

 root@controlplane ~   kubectl apply -f  devops/kubernetes-challenges/challenge-2/fileserver-pvc.yaml

persistentvolumeclaim/data-pvc created

 root@controlplane ~   kubectl get pvc  -A

NAME       STATUS    VOLUME    CAPACITY   ACCESS MODES   STORAGECLASS   AGE

data-pvc   Pending   data-pv   0                                        3s

root@controlplane ~


7. Create a pod for fileserver, name: 'gop-fileserver'  & validate the status 
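A sketch of fileserver-pod.yaml matching the requirements; the app: gop-fileserver label is my own assumption, added so the Service in the next step has something to select on:

apiVersion: v1
kind: Pod
metadata:
  name: gop-fileserver
  labels:
    app: gop-fileserver      # assumed label, referenced by the Service selector in the next step
spec:
  containers:
    - name: gop-fileserver
      image: kodekloud/fileserver
      volumeMounts:
        - name: data-store   # must match the volume name defined below
          mountPath: /web
  volumes:
    - name: data-store
      persistentVolumeClaim:
        claimName: data-pvc

Since the controlplane node carries the usual control-plane taints, the scheduler places this pod on node01, where the /web hostPath data lives.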

root@controlplane ~   kubectl apply -f  devops/kubernetes-challenges/challenge-2/fileserver-pod.yaml

pod/gop-fileserver created

 root@controlplane ~   kubectl get pod

NAME             READY   STATUS    RESTARTS   AGE

gop-fileserver   1/1     Running   0          7s

 root@controlplane ~


8. Create new Service, name: 'gop-fs-service' & validate the status
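A sketch of fileserver-svc.yaml; the NodePort type and nodePort 31200 are inferred from the kubectl get svc output below, and the selector assumes the pod label from the previous step:

apiVersion: v1
kind: Service
metadata:
  name: gop-fs-service
spec:
  type: NodePort
  selector:
    app: gop-fileserver     # must match the pod's label
  ports:
    - port: 8080
      targetPort: 8080
      nodePort: 31200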

root@controlplane ~   kubectl apply -f  devops/kubernetes-challenges/challenge-2/fileserver-svc.yaml

service/gop-fs-service created

 root@controlplane ~   kubectl get svc

NAME             TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)          AGE

gop-fs-service   NodePort    10.106.106.55   <none>        8080:31200/TCP   11s

kubernetes       ClusterIP   10.96.0.1       <none>        443/TCP          48m

 root@controlplane ~  
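As a quick sanity check (assuming the file server speaks plain HTTP), the gallery should now respond on the NodePort of either node:

root@controlplane ~   curl -s http://node01:31200 | head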


9. Click on Check & Confirm to complete the task successfully


Happy Learning!!!!


Apart from this, if you need more clarity, I have made a tutorial video on this topic. Please go through it and share your comments. Like and share the knowledge!


