ChaosToolKit Cluster Level Pod Delete Experiment Details in kube-system

Experiment Metadata

Type	Description	Tested K8s Platform
ChaosToolKit	ChaosToolKit Cluster Level Pod delete experiment	Kubeadm, Minikube

Prerequisites

Ensure that the Litmus ChaosOperator is running by executing kubectl get pods in the operator namespace (typically, litmus). If not, install from here
Ensure that the k8-pod-delete experiment resource is available in the cluster by executing kubectl get chaosexperiments in the desired namespace. If not, install from here
Ensure that the default nginx application is set up in the default namespace (if you are using a different namespace, run the commands below in that namespace)
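The first two checks above can be wrapped in a small helper. This is a minimal sketch, not part of Litmus: the function name check_prereqs and the variables OPERATOR_NS and EXPERIMENT_NS are illustrative, and the defaults assume the operator runs in litmus and the experiment is installed in default.

```shell
#!/bin/sh
# Illustrative variables (assumptions): adjust to your cluster layout.
OPERATOR_NS="${OPERATOR_NS:-litmus}"
EXPERIMENT_NS="${EXPERIMENT_NS:-default}"

check_prereqs() {
  # List the operator namespace; inspect the output to confirm that the
  # ChaosOperator pod is in the Running state.
  kubectl get pods -n "$OPERATOR_NS" || return 1
  # Fails with a NotFound error if the k8-pod-delete ChaosExperiment
  # is not installed in the chosen namespace.
  kubectl get chaosexperiments k8-pod-delete -n "$EXPERIMENT_NS" || return 1
}
```

Run check_prereqs after sourcing the snippet; a non-zero exit status means one of the kubectl calls failed.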

Entry Criteria

Application replicas are healthy before chaos injection
Service resolution works successfully as determined by deploying a sample nginx application and a custom liveness app querying the nginx application health end point
The experiment is executed against an application in a kube-system type namespace

Exit Criteria

Application replicas are healthy after chaos injection
Service resolution works successfully as determined by deploying a sample nginx application and a custom liveness app querying the nginx application health end point

Details

Causes graceful pod failure of ChaosToolKit replicas based on the provided namespace, label, and endpoint
Tests deployment sanity with a steady-state hypothesis check before and after pod failures
Service resolution will fail if application replicas are not present.

Use Cases for executing the experiment

Type	Experiment	Details	json
ChaosToolKit	ChaosToolKit single, random pod delete experiment with count	Executing via label name k8s-app=<>	pod-custom-kill-health.json
TEST_NAMESPACE	Placeholder for the namespace from which the chaos experiment is executed	Optional	Defaults to `default`

Integrations

Pod failures can be effected using one of these chaos libraries: litmus

Steps to Execute the ChaosExperiment

This ChaosExperiment can be triggered by creating a ChaosEngine resource on the cluster. To understand the values to provide in a ChaosEngine specification, refer to Getting Started
Follow the steps in the sections below to create the chaosServiceAccount, prepare the ChaosEngine & execute the experiment.

Prepare chaosServiceAccount

Based on your use case, pick one of the choices from here: https://hub.litmuschaos.io/kube-components/k8-prometheus-operator
- Service owner use case
  - Install the RBAC for the cluster in the namespace from which you are executing the experiments: kubectl apply -f rbac-admin.yaml

Sample Rbac Manifest for Cluster Owner use case

apiVersion: v1
kind: ServiceAccount
metadata:
  name: chaos-admin
  labels:
    name: chaos-admin
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: chaos-admin
  labels:
    name: chaos-admin
rules:
  - apiGroups: ["","apps","batch"]
    resources: ["jobs","deployments","daemonsets"]
    verbs: ["create","list","get","patch","delete"]
  - apiGroups: ["","litmuschaos.io"]
    resources: ["pods","configmaps","events","services","chaosengines","chaosexperiments","chaosresults","deployments","jobs"]
    verbs: ["get","create","update","patch","delete","list"]
  - apiGroups: [""]
    resources: ["nodes"]
    verbs: ["get","list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: chaos-admin
  labels:
    name: chaos-admin
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: chaos-admin
subjects:
- kind: ServiceAccount
  name: chaos-admin
  namespace: default
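After applying the manifest, it can be useful to confirm that the binding took effect. The sketch below uses kubectl auth can-i with impersonation to ask whether the chaos-admin ServiceAccount may delete pods. The helper names sa_subject and can_delete_pods are hypothetical, and the check assumes that your own user is allowed to impersonate service accounts.

```shell
#!/bin/sh
# Build the impersonation subject string for a ServiceAccount.
# $1 = namespace of the ServiceAccount, $2 = ServiceAccount name
sa_subject() {
  printf 'system:serviceaccount:%s:%s\n' "$1" "$2"
}

# Ask the API server whether chaos-admin (bound in the default namespace,
# as in the manifest above) may delete pods in the namespace given as $1.
can_delete_pods() {
  kubectl auth can-i delete pods --as="$(sa_subject default chaos-admin)" -n "$1"
}
```

For example, can_delete_pods kube-system prints yes or no depending on the answer from the API server.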

Prepare ChaosEngine

Provide the application info in spec.appinfo

By default, it is set as follows:

  appinfo:
    appns: addon-metricset-ns
    applabel: 'k8s-app=prometheus-operator'
    appkind: deployment

Override the experiment tunables if desired in experiments.spec.components.env
To understand the values to provide in a ChaosEngine specification, refer to ChaosEngine Concepts

Supported Experiment Tunables

Variables	Description	Specify In ChaosEngine	Notes
NAME_SPACE	The chaos namespace in which all infra chaos resources are created	Mandatory	Defaults to kube-system
LABEL_NAME	The name of the label used to select the target pods	Mandatory	Defaults to `k8s-app=prometheus-operator`
APP_ENDPOINT	Endpoint that ChaosToolKit calls to ensure the application is healthy	Mandatory	Defaults to localhost
FILE	The chaos experiment file to execute	Mandatory	Defaults to `pod-custom-kill-health.json`
REPORT	Whether to produce the execution report in JSON format	Optional	Defaults to `false`
REPORT_ENDPOINT	Endpoint that accepts the JSON report and submits it	Optional	Defaults to a setup for a Kafka topic for chaos, but can support any reporting database
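If you template the ChaosEngine manifest from shell variables (an assumption; the tunables themselves live in the ChaosEngine env section), a quick pre-flight check can catch an unset mandatory tunable before the manifest is applied. The function name missing_tunables is hypothetical.

```shell
#!/bin/sh
# Print the names of the mandatory tunables from the table above that are
# unset or empty in the current shell, separated by spaces.
missing_tunables() {
  missing=""
  for name in NAME_SPACE LABEL_NAME APP_ENDPOINT FILE; do
    # Indirect lookup of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    [ -n "$value" ] || missing="$missing $name"
  done
  # Drop the leading space before printing.
  echo "${missing# }"
}
```

An empty result means all four mandatory tunables have a value.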

Sample ChaosEngine Manifest

apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: k8-prometheus-operator
  namespace: default
spec:
  appinfo:
    appns: 'default'
    applabel: "k8s-app=prometheus-operator"
    appkind: deployment
  annotationCheck: 'false'
  engineState: 'active'
  chaosServiceAccount: chaos-admin
  experiments:
    - name: k8-pod-delete
      spec:
        components:
          env:
            # set chaos namespace; we assume you are using addon-metricset-ns, if not, modify the namespace below
            - name: NAME_SPACE
              value: addon-metricset-ns
            # set chaos label name
            - name: LABEL_NAME
              value: k8s-app=prometheus-operator
            # pod endpoint
            - name: APP_ENDPOINT
              value: 'localhost'
            - name: FILE
              value: 'pod-custom-kill-health.json'
            - name: REPORT
              value: 'false'
            - name: REPORT_ENDPOINT
              value: 'none'
            - name: TEST_NAMESPACE
              value: 'default'

Create the ChaosEngine Resource

Apply the ChaosEngine manifest prepared in the previous step to trigger the chaos.

kubectl apply -f chaosengine.yml

Watch Chaos progress

View ChaosToolKit pod terminations & recovery by setting up a watch on the ChaosToolKit pods in the application namespace

watch kubectl get pods -n kube-system
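As an alternative to watch, kubectl can stream pod changes itself and narrow the output to the targeted pods. The sketch below assumes you pass the same values you set for NAME_SPACE and LABEL_NAME; the function name watch_target_pods is hypothetical.

```shell
#!/bin/sh
# Stream status changes for pods matching a label selector.
# $1 = namespace (the NAME_SPACE tunable), $2 = label selector (the LABEL_NAME tunable)
watch_target_pods() {
  kubectl get pods -n "$1" -l "$2" --watch
}
```

For the sample ChaosEngine above, this would be called as watch_target_pods addon-metricset-ns k8s-app=prometheus-operator.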

Check ChaosExperiment Result

Check whether the application is resilient to the ChaosToolKit pod failure, once the experiment (job) is completed. The ChaosResult resource name is derived like this: <ChaosEngine-Name>-<ChaosExperiment-Name>.

kubectl describe chaosresult k8-prometheus-operator-k8-pod-delete -n <chaos-namespace>
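The naming rule stated above can be applied mechanically. This is a small sketch with hypothetical helper names (chaosresult_name, describe_chaosresult) that builds the ChaosResult name from the ChaosEngine and ChaosExperiment names and then describes it.

```shell
#!/bin/sh
# Derive the ChaosResult name: <ChaosEngine-Name>-<ChaosExperiment-Name>.
# $1 = ChaosEngine name, $2 = ChaosExperiment name
chaosresult_name() {
  printf '%s-%s\n' "$1" "$2"
}

# Describe the derived ChaosResult.
# $1 = ChaosEngine name, $2 = ChaosExperiment name, $3 = chaos namespace
describe_chaosresult() {
  kubectl describe chaosresult "$(chaosresult_name "$1" "$2")" -n "$3"
}
```

With the sample manifest, chaosresult_name k8-prometheus-operator k8-pod-delete yields k8-prometheus-operator-k8-pod-delete.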

Check ChaosExperiment logs

Check the logs and result of the experiment

kubectl logs -f k8-pod-delete-<> -n <chaos-namespace>
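The suffix of the experiment pod name is generated, so it has to be looked up first. A minimal sketch, assuming the experiment pod name starts with k8-pod-delete- as in the command above; the helper names experiment_pods and follow_experiment_logs are hypothetical.

```shell
#!/bin/sh
# List pods in namespace $1 whose name starts with k8-pod-delete-.
# "kubectl get pods -o name" prints entries in the form pod/<name>.
experiment_pods() {
  kubectl get pods -n "$1" -o name | grep '^pod/k8-pod-delete-'
}

# Follow the logs of the first matching experiment pod in namespace $1.
follow_experiment_logs() {
  pod="$(experiment_pods "$1" | head -n 1)"
  if [ -z "$pod" ]; then
    echo "no k8-pod-delete pod found in namespace $1" >&2
    return 1
  fi
  kubectl logs -f "$pod" -n "$1"
}
```

Call follow_experiment_logs with the chaos namespace once the experiment job has started.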

Litmus Docs

1.13.6
