aster.cloud aster.cloud
  • /
  • Platforms
    • Public Cloud
    • On-Premise
    • Hybrid Cloud
    • Data
  • Architecture
    • Design
    • Solutions
    • Enterprise
  • Engineering
    • Automation
    • Software Engineering
    • Project Management
    • DevOps
  • Programming
    • Learning
  • Tools
  • About
  • /
  • Platforms
    • Public Cloud
    • On-Premise
    • Hybrid Cloud
    • Data
  • Architecture
    • Design
    • Solutions
    • Enterprise
  • Engineering
    • Automation
    • Software Engineering
    • Project Management
    • DevOps
  • Programming
    • Learning
  • Tools
  • About
aster.cloud aster.cloud
  • /
  • Platforms
    • Public Cloud
    • On-Premise
    • Hybrid Cloud
    • Data
  • Architecture
    • Design
    • Solutions
    • Enterprise
  • Engineering
    • Automation
    • Software Engineering
    • Project Management
    • DevOps
  • Programming
    • Learning
  • Tools
  • About
  • Platforms

Decoding The Self-Healing Kubernetes: Step By Step

  • aster.cloud
  • May 28, 2020
  • 5 minute read

Business application that fails to operate 24/7 would be considered inefficient in the market. The idea is that applications run uninterrupted irrespective of a technical glitch, feature update, or a natural disaster. In today’s heterogeneous environment where infrastructure is intricately layered, a continuous workflow of application is possible via self-healing.

Captain America asked Bruce Banner in Avengers to get angry to transform into ‘The Hulk’. Bruce replied, “That’s my secret Captain. I’m always angry.”You must have understood the analogy here. Let’s simplify – Kubernetes will self-heal organically, whenever the system is affected.

Kubernetes, which is a container orchestration tool, facilitates the smooth working of the application by abstracting machines physically. Moreover, the pods and containers in Kubernetes can self-heal.


Partner with aster.cloud
for your next big idea.
Let us know here.



From our partners:

CITI.IO :: Business. Institutions. Society. Global Political Economy.
CYBERPOGO.COM :: For the Arts, Sciences, and Technology.
DADAHACKS.COM :: Parenting For The Rest Of Us.
ZEDISTA.COM :: Entertainment. Sports. Culture. Escape.
TAKUMAKU.COM :: For The Hearth And Home.
ASTER.CLOUD :: From The Cloud And Beyond.
LIWAIWAI.COM :: Intelligence, Inside and Outside.
GLOBALCLOUDPLATFORMS.COM :: For The World's Computing Needs.
FIREGULAMAN.COM :: For The Fire In The Belly Of The Coder.
ASTERCASTER.COM :: Supra Astra. Beyond The Stars.
BARTDAY.COM :: Prosperity For Everyone.

Kubernetes’s self-healing property ensures that the clusters always function at the optimal state. Kubernetes can self-detect two types of object – podstatus and containerstatus. Kubernetes’s orchestration capabilities can monitor and replace unhealthy container as per the desired configuration. Likewise, Kubernetes can fix pods, which are the smallest units encompassing single or multiple containers.

The three container states include

01. Waiting – created but not running. A container, which is in a waiting stage, will still run operations like pulling images or applying secrets, etc. To check the Waiting pod status, use the below command.

kubectl describe pod [POD_NAME]

Along with this state, a message and reason about the state are displayed to provide more information.

... 
State: Waiting 
Reason: ErrImagePull 
...

02. Running Pods – containers that are running without issues. The following command is executed before the pod enters the Running state.

postStart

Running pods will display the time of the entrance of the container.

...   
State:          Running    
Started:      Wed, 30 Jan 2019 16:46:38 +0530 
...

03. Terminated Pods – container, which fails or completes its execution; stands terminated. The following command is executed before the pod is moved to Terminated.

prestop

Terminated pods will display the time of the entrance of the container.

…
State:          Terminated
Reason:       Completed
Exit Code:    0
Started:      Wed, 30 Jan 2019 11:45:26 +0530
Finished:     Wed, 30 Jan 2019 11:45:26 +0530
…

Kubernetes’ self-healing Concepts – pod’s phase, probes, and restart policy.

The pod phase in Kubernetes offers insight into the pod’s placement. We can have

  • Pending Pods – created but not running
  • Running Pods – runs all the containers
  • Succeeded Pods – successfully completed container lifecycle
  • Failed Pods – minimum one container failed and all container terminated
  • Unknown Pods
Read More  Google Cloud Next 2019 | NASA FDL + Google Cloud: Identifying Exoplanets - Finding Life

Kubernetes execute liveliness and readiness probes for the Pods to check if they function as per the desired state. The liveliness probe will check a container for its running status. If a container fails the probe, Kubernetes will terminate it and create a new container in accordance with the restart policy. The readiness probe will check a container for its service request serving capabilities. If a container fails the probe, then Kubernetes will remove the IP address of the related pod.

Liveliness probe example.

apiVersion: v1
kind: Pod
metadata:
  labels:
    test: liveness
  name: liveness-http
spec:
  containers:
  - args:
    - /server
    image: k8s.gcr.io/liveness
    livenessProbe:
      httpGet:
        # when "host" is not defined, "PodIP" will be used
        # host: my-host
        # when "scheme" is not defined, "HTTP" scheme will be used. Only "HTTP" and "HTTPS" are allowed
        # scheme: HTTPS
        path: /healthz
        port: 8080
        httpHeaders:
        - name: X-Custom-Header
          value: Awesome
      initialDelaySeconds: 15
      timeoutSeconds: 1
    name: liveness

The probes include

  • ExecAction – to execute commands in containers.
  • TCPSocketAction – to implement a TCP check w.r.t to the IP address of a container.
  • HTTPGetAction – to implement a HTTP Get check w.r.t to the IP address of a container.

Each probe gives one of three results:

  • Success: The Container passed the diagnostic.
  • Failure: The Container failed the diagnostic.
  • Unknown: The diagnostic failed, so no action should be taken.

Demo description of Self-Healing Kubernetes – Example 1

We need to set the code replication to trigger the self-healing capability of Kubernetes.

Let’s see an example of the Nginx file.

apiVersion: apps/v1 
kind: Deployment
metadata:
  name: nginx-deployment-sample
spec:
  selector:
    matchLabels:
      app: nginx
  replicas:4
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.14.2
        ports:
        - containerPort: 80

In the above code, we see that the total number of pods across the cluster must be 4.

Read More  LF Edge’s Akraino Project Release 3 Now Available, Unifying Open Source Blueprints Across MEC, AI, Cloud and Telecom Edge

Let’s now deploy the file.

kubectl apply nginx-deployment-sample

Let’s list the pods, using

kubectl get pods -l app=nginx

Here is the output.

As you see above, we have created 4 pods.

Let’s delete one of the pods.

kubectl delete nginx-deployment-test-83586599-r299i

The pod is now deleted. We get the following output

pod "deployment nginx-deployment-test-83586599-r299i" deleted

Now again, list the pods.

kubectl get pods -l app=nginx

We get the following output.

We have 4 pods again, despite deleting one.

Kubernetes has self-healed to create a new node and maintain the count to 4.

Demo description of Self-Healing Kubernetes – Example 2

Get pod details

$ kubectl get pods -o wide

Get first nginx pod and delete it – one of the nginx pods should be in ‘Terminating’ status

$ NGINX_POD=$(kubectl get pods -l app=nginx --output=jsonpath="{.items[0].metadata.name}")
$ kubectl delete pod $NGINX_POD; kubectl get pods -l app=nginx -o wide
$ sleep 10

Get pod details – one nginx pod should be freshly started

$ kubectl get pods -l app=nginx -o wide

Get deployment details and check the events for recent changes

$ kubectl describe deployment nginx-deployment

Halt one of the nodes (node2)

$ vagrant halt node2$ sleep 30

Get node details – node2 Status=NotReady

$ kubectl get nodes

Get pod details – everything looks fine – you need to wait 5 minutes

$ kubectl get pods -o wide

Pod will not be evicted until it is 5 minutes old – (see Tolerations in ‘describe pod’ ). It prevents Kubernetes to spin up the new containers when it is not necessary

$ NGINX_POD=$(kubectl get pods -l app=nginx --
output=jsonpath="{.items[0].metadata.name}")
$ kubectl describe pod $NGINX_POD | grep -A1 Tolerations

Sleeping for 5 minutes

$ sleep 300

Get pods details – Status=Unknown/NodeLost and new container was started

$ kubectl get pods -o wide

Get depoyment details – again AVAILABLE=3/3

$ kubectl get deployments -o wide

Power on the node2 node

$ vagrant up node2
$ sleep 70

Get node details – node2 should be Ready again

$ kubectl get nodes

Get pods details – ‘Unknown’ pods were removed

$ kubectl get pods -o wide

**Source: GitHub. Author: Petr Ruzicka

Conclusion

Kubernetes can self-heal applications and containers, but what about healing itself when the nodes are down? For Kubernetes to continue self-healing, it needs a dedicated set of infrastructure, with access to self-healing nodes all the time. The infrastructure must be driven by automation and powered by predictive analytics to preempt and fix issues beforehand. The bottom line is that at any given point in time, the infrastructure nodes should maintain the required count for uninterrupted services.

Read More  Announcing The General Availability Of VMware Tanzu Kubernetes Grid 1.5

 

Guest post originally published on the Msys Technology blog by Atul Jadhav

Source: CNCF blog 


For enquiries, product placements, sponsorships, and collaborations, connect with us at [email protected]. We'd love to hear from you!

Our humans need coffee too! Your support is highly appreciated, thank you!

aster.cloud

Related Topics
  • Cloud Native Computing Foundation
  • CNCF
  • GitHub
  • Kubernetes
  • Self-Healing Kubernetes
You May Also Like
Google Cloud and Smart Communications
View Post
  • Platforms
  • Technology

Smart Communications, Inc. Dials into Google Cloud AI to Help Personalize Digital Services for Filipinos

  • October 25, 2024
View Post
  • Platforms
  • Public Cloud

Empowering builders with the new AWS Asia Pacific (Malaysia) Region

  • August 30, 2024
Red Hat and Globe Telecoms
View Post
  • Platforms
  • Technology

Globe Collaborates with Red Hat Open Innovation Labs to Modernize IT Infrastructure for Greater Agility and Scalability

  • August 19, 2024
Huawei Cloud Cairo Region Goes Live
View Post
  • Cloud-Native
  • Computing
  • Platforms

Huawei Cloud Goes Live in Egypt

  • May 24, 2024
Asteroid
View Post
  • Computing
  • Platforms
  • Technology

Asteroid Institute And Google Cloud Identify 27,500 New Asteroids, Revolutionizing Minor Planet Discovery With Cloud Technology

  • April 30, 2024
IBM
View Post
  • Hybrid Cloud
  • Platforms

IBM To Acquire HashiCorp, Inc. Creating A Comprehensive End-to-End Hybrid Cloud Platform

  • April 24, 2024
View Post
  • Platforms
  • Technology

Canonical Delivers Secure, Compliant Cloud Solutions for Google Distributed Cloud

  • April 9, 2024
Redis logo
View Post
  • Platforms
  • Software

Redis Moves To Source-Available Licenses

  • April 2, 2024

Stay Connected!
LATEST
  • college-of-cardinals-2025 1
    The Definitive Who’s Who of the 2025 Papal Conclave
    • May 7, 2025
  • conclave-poster-black-smoke 2
    The World Is Revalidating Itself
    • May 6, 2025
  • 3
    Conclave: How A New Pope Is Chosen
    • April 25, 2025
  • Getting things done makes her feel amazing 4
    Nurturing Minds in the Digital Revolution
    • April 25, 2025
  • 5
    AI is automating our jobs – but values need to change if we are to be liberated by it
    • April 17, 2025
  • 6
    Canonical Releases Ubuntu 25.04 Plucky Puffin
    • April 17, 2025
  • 7
    United States Army Enterprise Cloud Management Agency Expands its Oracle Defense Cloud Services
    • April 15, 2025
  • 8
    Tokyo Electron and IBM Renew Collaboration for Advanced Semiconductor Technology
    • April 2, 2025
  • 9
    IBM Accelerates Momentum in the as a Service Space with Growing Portfolio of Tools Simplifying Infrastructure Management
    • March 27, 2025
  • 10
    Tariffs, Trump, and Other Things That Start With T – They’re Not The Problem, It’s How We Use Them
    • March 25, 2025
about
Hello World!

We are aster.cloud. We’re created by programmers for programmers.

Our site aims to provide guides, programming tips, reviews, and interesting materials for tech people and those who want to learn in general.

We would like to hear from you.

If you have any feedback, enquiries, or sponsorship request, kindly reach out to us at:

[email protected]
Most Popular
  • 1
    IBM contributes key open-source projects to Linux Foundation to advance AI community participation
    • March 22, 2025
  • 2
    Co-op mode: New partners driving the future of gaming with AI
    • March 22, 2025
  • 3
    Mitsubishi Motors Canada Launches AI-Powered “Intelligent Companion” to Transform the 2025 Outlander Buying Experience
    • March 10, 2025
  • PiPiPi 4
    The Unexpected Pi-Fect Deals This March 14
    • March 13, 2025
  • Nintendo Switch Deals on Amazon 5
    10 Physical Nintendo Switch Game Deals on MAR10 Day!
    • March 9, 2025
  • /
  • Technology
  • Tools
  • About
  • Contact Us

Input your search keywords and press Enter.