ChaosToolKit Cluster Level Pod Delete Experiment Details in kube-system
Experiment Metadata
Type | Description | Tested K8s Platform |
---|---|---|
ChaosToolKit | ChaosToolKit Cluster Level Pod delete experiment | Kubeadm, Minikube |
Prerequisites
- Ensure that the Litmus ChaosOperator is running by executing `kubectl get pods` in the operator namespace (typically, `litmus`). If not, install from here
- Ensure that the `k8-pod-delete` experiment resource is available in the cluster by executing `kubectl get chaosexperiments` in the desired namespace. If not, install from here
- Ensure that the default nginx application is set up in the `default` namespace (if you are using a specific namespace, run the steps below in that namespace)
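The first two prerequisite checks can be scripted. The following is a minimal sketch, not part of the experiment itself: the function name `check_prereqs` is hypothetical, and it assumes that any pod in the `Running` phase in the operator namespace indicates a healthy operator, which is only an approximation.

```shell
#!/bin/sh
# check_prereqs OPERATOR_NS EXPERIMENT_NS   (hypothetical helper)
# Returns 0 only if the operator namespace has at least one Running pod
# and the k8-pod-delete ChaosExperiment exists in the experiment namespace.
check_prereqs() {
    operator_ns="${1:-litmus}"
    experiment_ns="${2:-default}"

    # Count pods in the Running phase in the operator namespace.
    running=$(kubectl get pods -n "$operator_ns" \
        --field-selector=status.phase=Running --no-headers 2>/dev/null \
        | wc -l | tr -d ' ')
    if [ "$running" -eq 0 ]; then
        echo "no running pods found in namespace $operator_ns" >&2
        return 1
    fi

    # The experiment resource must be present before a ChaosEngine can use it.
    if ! kubectl get chaosexperiments k8-pod-delete -n "$experiment_ns" >/dev/null 2>&1; then
        echo "chaosexperiment k8-pod-delete not found in namespace $experiment_ns" >&2
        return 1
    fi

    echo "prerequisites satisfied"
}
```

For example, `check_prereqs litmus default` checks the typical operator namespace and the `default` experiment namespace.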
Entry Criteria
- Application replicas are healthy before chaos injection
- Service resolution works successfully, as determined by deploying a sample nginx application and a custom liveness app that queries the nginx application health endpoint
- The experiment is executed against an application in a kube-system type namespace
Exit Criteria
- Application replicas are healthy after chaos injection
- Service resolution works successfully, as determined by deploying a sample nginx application and a custom liveness app that queries the nginx application health endpoint
Details
- Causes graceful pod failure of ChaosToolKit replicas based on the provided namespace, label, and endpoint
- Runs a deployment sanity check using the steady-state hypothesis before and after the pod failures
- Service resolution will fail if application replicas are not present.
Use Cases for executing the experiment
Type | Experiment | Details | json |
---|---|---|---|
ChaosToolKit | ChaosToolKit single, random pod delete experiment with count | Executing via label name k8s-app=<> | pod-custom-kill-health.json |
TEST_NAMESPACE | Placeholder namespace from which the chaos experiment is executed | Optional | Defaults to `default` |
Integrations
- Pod failures can be effected using one of these chaos libraries: `litmus`
Steps to Execute the ChaosExperiment
This ChaosExperiment can be triggered by creating a ChaosEngine resource on the cluster. To understand the values to provide in a ChaosEngine specification, refer to Getting Started
Follow the steps in the sections below to create the chaosServiceAccount, prepare the ChaosEngine & execute the experiment.
Prepare chaosServiceAccount
- Based on your use case, pick one of the choices from here: https://hub.litmuschaos.io/kube-components/k8-prometheus-operator
- Service owner use case
  - Install the RBAC for the cluster in the namespace from which you are executing the experiments:
    kubectl apply -f rbac-admin.yaml
Sample Rbac Manifest for Cluster Owner use case
apiVersion: v1
kind: ServiceAccount
metadata:
  name: chaos-admin
  labels:
    name: chaos-admin
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: chaos-admin
  labels:
    name: chaos-admin
rules:
- apiGroups: ["","apps","batch"]
  resources: ["jobs","deployments","daemonsets"]
  verbs: ["create","list","get","patch","delete"]
- apiGroups: ["","litmuschaos.io"]
  resources: ["pods","configmaps","events","services","chaosengines","chaosexperiments","chaosresults","deployments","jobs"]
  verbs: ["get","create","update","patch","delete","list"]
- apiGroups: [""]
  resources: ["nodes"]
  verbs: ["get","list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: chaos-admin
  labels:
    name: chaos-admin
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: chaos-admin
subjects:
- kind: ServiceAccount
  name: chaos-admin
  namespace: default
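After applying the RBAC manifest, it can be useful to confirm that the service account actually holds the permissions the experiment needs. The sketch below wraps `kubectl auth can-i` with impersonation of the `chaos-admin` service account; the helper name `can_sa_do` is hypothetical, and which verbs and resources to check is an assumption based on the ClusterRole rules above.

```shell
#!/bin/sh
# can_sa_do SA_NAMESPACE SA_NAME VERB RESOURCE TARGET_NAMESPACE   (hypothetical helper)
# Asks the API server whether the given service account may perform
# VERB on RESOURCE in TARGET_NAMESPACE. kubectl prints "yes" or "no".
can_sa_do() {
    kubectl auth can-i "$3" "$4" \
        --as="system:serviceaccount:$1:$2" \
        -n "$5"
}
```

For example, `can_sa_do default chaos-admin delete pods kube-system` asks whether `chaos-admin` may delete pods in `kube-system`. The caller's own account needs impersonation rights for the `--as` flag to work.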
Prepare ChaosEngine
- Provide the application info in `spec.appinfo`
  - It defaults to:
    appinfo:
      appns: addon-metricset-ns
      applabel: 'k8s-app=prometheus-operator'
      appkind: deployment
- Override the experiment tunables if desired in `experiments.spec.components.env`
- To understand the values to provide in a ChaosEngine specification, refer to ChaosEngine Concepts
Supported Experiment Tunables
Variables | Description | Specify In ChaosEngine | Notes |
---|---|---|---|
NAME_SPACE | The chaos namespace in which all infra chaos resources are created | Mandatory | Defaults to `kube-system` |
LABEL_NAME | The label used to select the target pods | Mandatory | Defaults to `k8s-app=prometheus-operator` |
APP_ENDPOINT | Endpoint that ChaosToolKit calls to verify that the application is healthy | Mandatory | Defaults to `localhost` |
FILE | The chaos experiment file to execute | Mandatory | Defaults to `pod-custom-kill-health.json` |
REPORT | Whether to produce the execution report in JSON format | Optional | Defaults to `false` |
REPORT_ENDPOINT | Report endpoint that accepts the JSON report and submits it | Optional | Defaults to a Kafka topic set up for chaos, but any reporting database can be supported |
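To make the fallback behaviour in the table concrete, here is a minimal sketch of how the documented defaults could be expressed with shell parameter expansion. This is an illustration only, not the experiment's actual implementation; the function name `resolve_tunables` is hypothetical, and `REPORT_ENDPOINT` is left out because the table does not give a concrete default value for it.

```shell
#!/bin/sh
# resolve_tunables   (hypothetical helper)
# Prints the effective value of each tunable, falling back to the
# documented default when the variable is unset or empty.
resolve_tunables() {
    echo "NAME_SPACE=${NAME_SPACE:-kube-system}"
    echo "LABEL_NAME=${LABEL_NAME:-k8s-app=prometheus-operator}"
    echo "APP_ENDPOINT=${APP_ENDPOINT:-localhost}"
    echo "FILE=${FILE:-pod-custom-kill-health.json}"
    echo "REPORT=${REPORT:-false}"
}
```

Running `NAME_SPACE=addon-metricset-ns resolve_tunables` overrides only the namespace and leaves the other values at their defaults.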
Sample ChaosEngine Manifest
apiVersion: litmuschaos.io/v1alpha1
kind: ChaosEngine
metadata:
  name: k8-prometheus-operator
  namespace: default
spec:
  appinfo:
    appns: 'default'
    applabel: "k8s-app=prometheus-operator"
    appkind: deployment
  annotationCheck: 'false'
  engineState: 'active'
  chaosServiceAccount: chaos-admin
  experiments:
    - name: k8-pod-delete
      spec:
        components:
          env:
            # set chaos namespace, we assume you are using the addon-metricset-ns if not modify the below namespace
            - name: NAME_SPACE
              value: addon-metricset-ns
            # set chaos label name
            - name: LABEL_NAME
              value: k8s-app=prometheus-operator
            # pod endpoint
            - name: APP_ENDPOINT
              value: 'localhost'
            - name: FILE
              value: 'pod-custom-kill-health.json'
            - name: REPORT
              value: 'false'
            - name: REPORT_ENDPOINT
              value: 'none'
            - name: TEST_NAMESPACE
              value: 'default'
Create the ChaosEngine Resource
Apply the ChaosEngine manifest prepared in the previous step to trigger the chaos.
kubectl apply -f chaosengine.yml
Watch Chaos progress
View ChaosToolKit pod terminations & recovery by setting up a watch on the ChaosToolKit pods in the application namespace
watch kubectl get pods -n kube-system
Check ChaosExperiment Result
Check whether the application is resilient to the ChaosToolKit pod failure once the experiment (job) is completed. The ChaosResult resource name is derived like this: `<ChaosEngine-Name>-<ChaosExperiment-Name>`. With the sample ChaosEngine above, that gives:
kubectl describe chaosresult k8-prometheus-operator-k8-pod-delete -n <chaos-namespace>
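The naming rule above can be captured in a small shell helper so the ChaosResult name does not have to be assembled by hand. This is a sketch under the assumption that the name is exactly the engine name and the experiment name joined by a hyphen, as stated; both function names are hypothetical.

```shell
#!/bin/sh
# chaosresult_name ENGINE EXPERIMENT   (hypothetical helper)
# Builds the ChaosResult name as <ChaosEngine-Name>-<ChaosExperiment-Name>.
chaosresult_name() {
    printf '%s-%s\n' "$1" "$2"
}

# describe_chaosresult ENGINE EXPERIMENT NAMESPACE   (hypothetical helper)
# Describes the ChaosResult derived from the engine and experiment names.
describe_chaosresult() {
    kubectl describe chaosresult "$(chaosresult_name "$1" "$2")" -n "$3"
}
```

For example, `describe_chaosresult k8-prometheus-operator k8-pod-delete default` describes the result for the sample ChaosEngine, assuming it was created in the `default` namespace.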
Check ChaosExperiment logs
Check the logs and result of the existing experiment
kubectl logs -f k8-pod-delete-<> -n <chaos-namespace>