Workers don't terminate after tests finish #164

@avik-pal

I have been seeing this only in the GPU tests. See the logs at https://buildkite.com/julialang/luxlib-dot-jl/builds/797#0190cc64-0b5a-4e2a-9e47-795d8fa7176e/309-616

The Batch Normalization, Group Normalization, and Instance Normalization tests are reported as "DONE", but their workers never terminate, which eventually causes the job to time out. This problem doesn't show up when the same tests are run on GitHub Actions (which runs only the CPU tests).

If I set the number of workers so that the tests do not run in parallel, the tests finish as expected. I have ReTestItems set up to run GPU tests on other repos (and they work perfectly), so I am not sure what is causing this issue.
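
For reference, the non-parallel workaround can be sketched roughly as below. This assumes the `nworkers` keyword of `ReTestItems.runtests`, where `0` runs test items in the current process instead of on separate worker processes. The actual call in the LuxLib test setup may differ.

```julia
using ReTestItems, LuxLib

# Assumption: nworkers = 0 runs all test items sequentially in the current
# process, so no separate worker processes are spawned (and none can hang).
ReTestItems.runtests(LuxLib; nworkers=0)
```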

P.S. This repo is amazing, it has cut down on our CI timings a great deal (and makes local testing so much easier)!

Labels: bug