Spark CR deletion leaves orphaned pods #701

@Maleware

Affected Stackable version

26.3.0

Affected Apache Spark-on-Kubernetes version

all

Current and expected behavior

Before 26.3

The CR held an owner reference on the driver pod, and the driver held one on the executor, so deleting the CR also deleted the pods. This was convenient, since reinstalling the SparkApp requires deleting the old CR.

After 26.3

The CR no longer holds an owner reference on the driver pod, while the driver still owns the executor. Deleting the CR therefore leaves the driver and executor untouched, and they keep running uncontrolled until they are deleted manually. The operator automatically deletes the driver pod once it reaches the "terminating" phase, but that might not happen in every case, for example during uninstalls.
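To illustrate the difference, here is a minimal sketch in plain Python that models cascading deletion along owner references. It does not talk to a cluster; the object names (`spark-app`, `driver`, `executor`) and the `cascade_delete` helper are made up for illustration and only approximate what the Kubernetes garbage collector does.

```python
def cascade_delete(objects, root):
    """Return the names removed when `root` is deleted.

    `objects` maps each object name to the name of its owner (or None).
    Dependents of a deleted owner are deleted as well, transitively.
    """
    deleted = {root}
    changed = True
    while changed:
        changed = False
        for name, owner in objects.items():
            if owner in deleted and name not in deleted:
                deleted.add(name)
                changed = True
    return deleted


# Before 26.3: CR -> driver -> executor
before = {"spark-app": None, "driver": "spark-app", "executor": "driver"}
# After 26.3: the driver has no owner, the executor is still owned by the driver
after = {"spark-app": None, "driver": None, "executor": "driver"}

survivors_before = set(before) - cascade_delete(before, "spark-app")
survivors_after = set(after) - cascade_delete(after, "spark-app")

print(sorted(survivors_before))  # nothing is left behind
print(sorted(survivors_after))   # driver and executor are orphaned
```

In this model, deleting the driver directly still removes the executor, which is the cascade the proposal relies on.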

Proposal

Purposefully deleting a CR should lead to the product being uninstalled. A finalizer on the SparkApplication might help: once deletionTimestamp is set, we could delete the driver pod, which then cascades to the executor via its ownerRef. This would preserve the existing behaviour, where the operator cleans up the driver, and thus its executor, once the driver reaches terminating.
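The finalizer idea could look roughly like the decision logic below. This is a sketch in plain Python, not operator code: the finalizer name, the `SparkApp` dataclass, the `reconcile` function and the action strings are all hypothetical, and a real implementation would issue the corresponding Kubernetes API calls instead of returning strings.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical finalizer name, chosen only for this example.
FINALIZER = "example.invalid/driver-cleanup"


@dataclass
class SparkApp:
    name: str
    deletion_timestamp: Optional[str] = None
    finalizers: List[str] = field(default_factory=list)


def reconcile(app: SparkApp, driver_exists: bool) -> List[str]:
    """Return the actions a reconcile pass would take for `app`."""
    if app.deletion_timestamp is None:
        # CR is live: make sure the finalizer is present so deletion blocks on cleanup.
        if FINALIZER not in app.finalizers:
            return ["add_finalizer"]
        return []
    if FINALIZER not in app.finalizers:
        # CR is being deleted and cleanup is already done (or was never requested).
        return []
    if driver_exists:
        # Deleting the driver cascades to the executor through its ownerRef.
        return ["delete_driver_pod"]
    # Driver is gone: release the CR so the API server can remove it.
    return ["remove_finalizer"]


deleting = SparkApp("demo", deletion_timestamp="2024-01-01T00:00:00Z",
                    finalizers=[FINALIZER])
print(reconcile(deleting, driver_exists=True))
print(reconcile(deleting, driver_exists=False))
```

Removing the finalizer only after the driver pod is gone means the CR stays visible until cleanup has actually happened.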

Possible solution

No response

Additional context

No response

Environment

No response

Would you like to work on fixing this bug?

None
