[FEATURE_REQUEST] New check to verify that ttlSecondsAfterFinished is set in Job objects and not set in CronJob objects #963

@wissamir

Description of the problem/feature request
Pods spawned by Job objects can stick around for a long time if they are not explicitly deleted after finishing what they are supposed to do. Users should be encouraged to deliberately set how long Pods created by Job objects should live, rather than leaving it to other clean-up mechanisms that may or may not be triggered.

Description of the existing behavior vs. expected behavior
Job and CronJob Kubernetes objects spawn Pods to perform whatever job they are meant to execute.

In standalone Job objects, the pod's TTL is controlled by the field ttlSecondsAfterFinished, which has no default value. When it is unset, the finished pod is not deleted automatically unless garbage collection thresholds are triggered. On nodes with large filesystems backing container storage, that collection may never run before issues involving the gRPC message buffer size appear.

In managed Job objects (created by CronJob objects), setting ttlSecondsAfterFinished might interfere with successfulJobsHistoryLimit and failedJobsHistoryLimit from the CronJob. The final behaviour is determined by whichever setting is stricter, which can easily cause confusion and unexpected behaviour.

Therefore, reasonable linting targets are:

  1. Advise setting ttlSecondsAfterFinished for standalone Job objects whenever it's not set
  2. Advise unsetting ttlSecondsAfterFinished for managed Job objects whenever it's set
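
To make the two targets concrete, here is a rough sketch of the logic in Python. It is not the proposed implementation and does not use any linter API: the function name `lint_ttl_seconds_after_finished` and the message wording are made up for illustration. It assumes each manifest has already been parsed into a plain dict, that a top-level `Job` in a manifest is standalone, and that a managed Job is the `jobTemplate` inside a `CronJob`.

```python
from typing import Any, Dict, List


def lint_ttl_seconds_after_finished(obj: Dict[str, Any]) -> List[str]:
    """Return lint messages for one parsed manifest (hypothetical helper)."""
    kind = obj.get("kind")
    spec = obj.get("spec") or {}
    name = (obj.get("metadata") or {}).get("name", "<unnamed>")

    if kind == "Job":
        # Target 1: a standalone Job should set the field explicitly.
        # Compare against None because 0 is a legitimate value.
        if spec.get("ttlSecondsAfterFinished") is None:
            return [f"Job {name}: set spec.ttlSecondsAfterFinished"]
    elif kind == "CronJob":
        # Target 2: the Job template of a CronJob should leave the field
        # unset and rely on the CronJob history limits instead.
        job_spec = (spec.get("jobTemplate") or {}).get("spec") or {}
        if job_spec.get("ttlSecondsAfterFinished") is not None:
            return [
                f"CronJob {name}: unset "
                "spec.jobTemplate.spec.ttlSecondsAfterFinished"
            ]
    return []


if __name__ == "__main__":
    standalone = {"kind": "Job", "metadata": {"name": "report"}, "spec": {}}
    managed = {
        "kind": "CronJob",
        "metadata": {"name": "nightly"},
        "spec": {"jobTemplate": {"spec": {"ttlSecondsAfterFinished": 100}}},
    }
    for manifest in (standalone, managed):
        for message in lint_ttl_seconds_after_finished(manifest):
            print(message)
```

Other kinds, a Job that sets the field, and a CronJob that leaves it unset all produce no messages in this sketch.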

Additional context
I'll be creating a PR to implement the new check
