

@zxqlxy zxqlxy commented Aug 19, 2025

Add a local lock to avoid a deadlock when the maxSurge rollingUpdate strategy is enabled, since two gpu-device-plugin instances run at the same time during a rolling update. When no lock is present, fall back to the remote node status to prevent the deadlock, although this is slightly slower.
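A rough sketch of the idea described above, not the code in this PR: the diff for the lock itself is not shown here, so the file-based lock, the names `tryLock`, `unlock`, and `nodeStatus`, and the choice to read the remote status whenever the lock is not obtained are all assumptions made for illustration. It uses a non-blocking advisory `flock`, so a second plugin instance never waits on the first one.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// tryLock attempts a non-blocking exclusive advisory lock on path.
// It returns the open file holding the lock and true on success,
// or nil and false when another holder already has the lock.
// Hypothetical helper: the PR may implement its lock differently.
func tryLock(path string) (*os.File, bool, error) {
	f, err := os.OpenFile(path, os.O_CREATE|os.O_RDWR, 0o600)
	if err != nil {
		return nil, false, err
	}
	if err := syscall.Flock(int(f.Fd()), syscall.LOCK_EX|syscall.LOCK_NB); err != nil {
		f.Close()
		if err == syscall.EWOULDBLOCK {
			return nil, false, nil // lock is held elsewhere; do not block
		}
		return nil, false, err
	}
	return f, true, nil
}

// unlock releases the advisory lock and closes the file.
func unlock(f *os.File) error {
	defer f.Close()
	return syscall.Flock(int(f.Fd()), syscall.LOCK_UN)
}

// nodeStatus reads the status via local() while holding the lock.
// If the lock cannot be obtained it calls remote() instead of waiting,
// which is slower but cannot deadlock (assumed interpretation).
func nodeStatus(lockPath string, local, remote func() string) (string, error) {
	f, ok, err := tryLock(lockPath)
	if err != nil {
		return "", err
	}
	if !ok {
		return remote(), nil
	}
	defer unlock(f)
	return local(), nil
}

func main() {
	status, err := nodeStatus(
		"/tmp/gpu-device-plugin-sketch.lock", // hypothetical lock path
		func() string { return "status from local cache" },
		func() string { return "status from API server" },
	)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(status)
}
```

Because the lock attempt never blocks, the old and new plugin pods that overlap during a surge cannot end up waiting on each other; the cost is an extra remote read for whichever instance does not hold the lock.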

@zxqlxy zxqlxy requested a review from linxiulei August 27, 2025 21:28
glog.Infof("Failed to build kube client: %v", err)
return
}
nodeName, err := util.GetEnv(nodeNameEnv)

This would require a change in the DaemonSet YAML, right? Should we fall back to getHostname if we fail to get this env var?
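The fallback suggested here could look roughly like the sketch below. It is only an illustration: `resolveNodeName` is a hypothetical helper, the value `"NODE_NAME"` for `nodeNameEnv` is assumed (the diff does not show it), and `os.Getenv` stands in for the repository's `util.GetEnv`, whose behavior is not shown in this thread.

```go
package main

import (
	"fmt"
	"os"
)

// nodeNameEnv is the environment variable holding the node name.
// The actual value used by the PR is not shown; "NODE_NAME" is assumed.
const nodeNameEnv = "NODE_NAME"

// resolveNodeName prefers the environment variable and falls back to the
// hostname when the variable is unset or empty. The lookups are injected
// as functions so the logic can be exercised without touching the process
// environment.
func resolveNodeName(getenv func(string) string, hostname func() (string, error)) (string, error) {
	if name := getenv(nodeNameEnv); name != "" {
		return name, nil
	}
	name, err := hostname()
	if err != nil {
		return "", fmt.Errorf("%s is unset and hostname lookup failed: %w", nodeNameEnv, err)
	}
	return name, nil
}

func main() {
	name, err := resolveNodeName(os.Getenv, os.Hostname)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println("node name:", name)
}
```

One caveat with this fallback: inside a pod, `os.Hostname` returns the pod's hostname, which only matches the node's hostname when the pod runs with `hostNetwork: true`, and even then the Kubernetes node name is not guaranteed to equal the hostname. Supplying the variable in the DaemonSet through a `fieldRef` to `spec.nodeName` avoids that ambiguity.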
