Open
Description
Currently on the LHCb HLT farm the CE is configured to send SIGUSR1 to the Gaudi processes directly.
This has a couple of issues:
1. If there is more than one job per pilot, the first job will exit gracefully, but the pilot isn't aware of the graceful shutdown and will start another job.
2. Sometimes the job hasn't got far enough to produce any output when the signal arrives, despite the situation being "okay".
I think the solution to both items is to have the CE communicate with DIRAC instead of Gaudi.
For item 2, it would be useful to make such cases easier to filter out, e.g. by setting the job status to "Killed" and setting a clearer application status.
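
To make the proposal concrete, here is a minimal sketch of what "the CE signals DIRAC instead of Gaudi" could look like on the pilot side. Everything in it is an assumption for illustration: `PilotLoop`, `handle_stop`, and the status strings other than "Killed" are hypothetical names, not DIRAC's actual JobAgent or API. The pilot process receives SIGUSR1, forwards it to the running payload, stops starting new jobs, and records a filterable status.

```python
import signal
import subprocess


class PilotLoop:
    """Hypothetical pilot-side loop (NOT the real DIRAC JobAgent)."""

    def __init__(self, job_commands):
        self.job_commands = list(job_commands)
        self.stop_requested = False
        self.current = None   # Popen handle of the running payload
        self.statuses = []    # (status, application status) per started job

    def handle_stop(self, signum, frame):
        # Remember the request so that no further job is started,
        # then forward the signal to the running payload, if there is one.
        self.stop_requested = True
        if self.current is not None and self.current.poll() is None:
            self.current.send_signal(signal.SIGUSR1)

    def run(self):
        # The CE would signal this process rather than the Gaudi payload.
        signal.signal(signal.SIGUSR1, self.handle_stop)
        for cmd in self.job_commands:
            if self.stop_requested:
                break  # do not pick up another job after a stop request
            self.current = subprocess.Popen(cmd)
            returncode = self.current.wait()
            if self.stop_requested:
                # Simplification: any job that was running when the stop
                # request arrived is marked "Killed", whatever its exit code.
                self.statuses.append(("Killed", "Stopped by CE shutdown request"))
            elif returncode == 0:
                self.statuses.append(("Done", "Execution Complete"))
            else:
                self.statuses.append(("Failed", f"Exited with code {returncode}"))
        return self.statuses
```

With this shape, a payload that has not yet produced output still ends up with an explicit "Killed" status and a descriptive application status, so such cases can be filtered out rather than looking like ordinary failures.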
aldbr