Scheduler Usage

The ChaosSchedule is the user-facing chaos custom resource with a namespace scope and is built on top of ChaosEngine. It schedules a number of instances of chaosengine according to the given schedule specifications.

It basically helps for the scheduled execution of chaos immediately, at a specific time or helps to execute chaos repeatedly with a specific time interval in between two instances. It brings more resiliency to the cluster because of the repeated executions of chaos and not just a single time as is the case of chaos engine.

Prepare ChaosSchedule

There are 3 scheduling strategies by which we can trigger the formation of chaosengine -

now
once
repeat

They can only be followed one at a time.

Sample ChaosSchedule Manifest for Now

apiVersion: litmuschaos.io/v1alpha1
kind: ChaosSchedule
metadata:
  name: schedule-nginx
spec:
  schedule:
    now: true
  engineTemplateSpec:
    appinfo:
      appns: 'default'
      applabel: 'app=nginx'
      appkind: 'deployment'
    # It can be true/false
    annotationCheck: 'true'
    #ex. values: ns1:name=percona,ns2:run=nginx
    auxiliaryAppInfo: ''
    chaosServiceAccount: pod-delete-sa
    monitoring: false
    # It can be delete/retain
    jobCleanUpPolicy: 'delete'
    experiments:
      - name: pod-delete
        spec:
          components:
            env:
              # set chaos duration (in sec) as desired
              - name: TOTAL_CHAOS_DURATION
                value: '30'

              # set chaos interval (in sec) as desired
              - name: CHAOS_INTERVAL
                value: '10'

              # pod failures without '--force' & default terminationGracePeriodSeconds
              - name: FORCE
                value: 'false'

Sample ChaosSchedule Manifest for Once

apiVersion: litmuschaos.io/v1alpha1
kind: ChaosSchedule
metadata:
  name: schedule-nginx
spec:
  schedule:
    once:
      executionTime: "2020-05-12T05:47:00Z"   #should be modified according to current UTC Time
  engineTemplateSpec:
    appinfo:
      appns: 'default'
      applabel: 'app=nginx'
      appkind: 'deployment'
    # It can be true/false
    annotationCheck: 'true'
    #ex. values: ns1:name=percona,ns2:run=nginx
    auxiliaryAppInfo: ''
    chaosServiceAccount: pod-delete-sa
    monitoring: false
    # It can be delete/retain
    jobCleanUpPolicy: 'delete'
    experiments:
      - name: pod-delete
        spec:
          components:
            env:
              # set chaos duration (in sec) as desired
              - name: TOTAL_CHAOS_DURATION
                value: '30'

              # set chaos interval (in sec) as desired
              - name: CHAOS_INTERVAL
                value: '10'

              # pod failures without '--force' & default terminationGracePeriodSeconds
              - name: FORCE
                value: 'false'

Sample ChaosSchedule Manifest for Repeat

apiVersion: litmuschaos.io/v1alpha1
kind: ChaosSchedule
metadata:
  name: schedule-nginx
spec:
  schedule:
    repeat:
      startTime: "2020-05-12T05:47:00Z"   #should be modified according to current UTC Time
      endTime: "2020-05-12T05:52:00Z"   #should be modified according to current UTC Time
      minChaosInterval: "2m"   #format should be like "10m" or "2h" accordingly for minutes and   hours
      instanceCount: "2"
      includedDays: "mon,tue,wed"
  engineTemplateSpec:
    appinfo:
      appns: 'default'
      applabel: 'app=nginx'
      appkind: 'deployment'
    # It can be true/false
    annotationCheck: 'true'
    #ex. values: ns1:name=percona,ns2:run=nginx
    auxiliaryAppInfo: ''
    chaosServiceAccount: pod-delete-sa
    monitoring: false
    # It can be delete/retain
    jobCleanUpPolicy: 'delete'
    experiments:
      - name: pod-delete
        spec:
          components:
            env:
              # set chaos duration (in sec) as desired
              - name: TOTAL_CHAOS_DURATION
                value: '30'

              # set chaos interval (in sec) as desired
              - name: CHAOS_INTERVAL
                value: '10'

              # pod failures without '--force' & default terminationGracePeriodSeconds
              - name: FORCE
                value: 'false'

Halt/Resume ChaosSchedule

Schdeules can be halted or resumed as per the need. But it only works for once or repeat strategy as now immediately creates the engine.

For halting an active schedule we can simply follow these steps -

Edit the applied chaosschedule

kubectl edit chaosschedule schedule-nginx

Halt the schedule

Change the spec.scheduleState to halt

spec:
  scheduleState: halt
  ...

For resuming a halted schedule we can simply follow these steps -

Edit the applied chaosschedule

kubectl edit chaosschedule schedule-nginx

Resume the schedule

Change the spec.scheduleState to active

spec:
  scheduleState: active
  ...

Litmus Docs

1.7.0

OpenShift

Rancher

Generic

Kubernetes

OpenEBS

Kafka

CoreDns

Cassandra

Prepare ChaosSchedule

Sample ChaosSchedule Manifest for Now

Sample ChaosSchedule Manifest for Once

Sample ChaosSchedule Manifest for Repeat

Halt/Resume ChaosSchedule

Edit the applied chaosschedule

Halt the schedule

Edit the applied chaosschedule

Resume the schedule