Improve From Source CI Jobs #49301

@alzimmermsft

Description

Currently, most From Source CI testing jobs run without problems, but in a few cases the From Source job builds and tests a non-trivial portion of the repository. Those jobs take a long time to run (>30 minutes) and commonly fail because of intermittent test flakiness. The causes are either timing issues in the tests combined with resource contention on the runner, or external issues such as failures pulling Test Proxy test recordings or Test Proxy timeouts caused by resource contention. We should look into a potential redesign of the From Source job.

A potential redesign would use an initializer stage that generates the job matrix on the fly instead of relying on a static job matrix. Non-From Source jobs could then be injected as-is, since they're relatively narrowly scoped, while the From Source job could be split into many jobs when it would build and test a large portion of the repository. The logic could go as follows:

  1. Determine the number of projects that will be built and tested using a modified version of Generate FromSource POM.
  2. If the count is under a certain threshold, use a single From Source job as we do today. If it is over the threshold, do the following.
  3. Select projects to act as From Source entry points (roots) based on the number of downstream projects they trigger to be tested. For example, azure-storage-common, the common dependency of the azure-storage-* libraries, which are frequently used by other SDKs, could be a root because it triggers many libraries to be built. /sdk/core and /sdk/identity libraries would be excluded as roots because they effectively cause the entire repo to be built.
  4. Each root project would trigger a From Source job to be injected into the matrix.
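
The steps above could be sketched roughly as follows. This is only an illustration of the idea, not how Generate FromSource POM works today. The threshold value, the `EXCLUDED_ROOTS` set, the greedy root selection, the function names, and the shape of the returned jobs are all assumptions made for the example, and the dependency graph in the usage note is made up.

```python
from collections import deque

# Hypothetical values: the issue does not specify a threshold, and these two
# names merely stand in for the /sdk/core and /sdk/identity libraries.
SPLIT_THRESHOLD = 3
EXCLUDED_ROOTS = {"azure-core", "azure-identity"}


def downstream(project, dependents):
    """Return every project that transitively depends on `project`."""
    seen = set()
    queue = deque([project])
    while queue:
        current = queue.popleft()
        for dependent in dependents.get(current, ()):
            if dependent not in seen:
                seen.add(dependent)
                queue.append(dependent)
    return seen


def from_source_jobs(changed, dependents):
    """Return a list of From Source jobs as {"root", "projects"} dicts."""
    # Step 1: count everything that would be built and tested.
    affected = set(changed)
    for project in changed:
        affected |= downstream(project, dependents)

    # Step 2: under the threshold, keep a single From Source job.
    if len(affected) <= SPLIT_THRESHOLD:
        return [{"root": None, "projects": sorted(affected)}]

    # Step 3: greedily pick roots by how many remaining projects they trigger,
    # never using an excluded project as a root.
    remaining = affected - EXCLUDED_ROOTS
    jobs = []
    while remaining:
        root = max(
            sorted(remaining),
            key=lambda p: len(downstream(p, dependents) & remaining),
        )
        covered = ({root} | downstream(root, dependents)) & remaining
        # Step 4: each root becomes one job injected into the matrix.
        jobs.append({"root": root, "projects": sorted(covered)})
        remaining -= covered
    return jobs
```

With a made-up graph where `azure-core` is depended on by `azure-storage-common`, `azure-identity`, and a hypothetical `azure-example-lib`, and `azure-storage-common` is depended on by `azure-storage-blob` and `azure-storage-queue`, a change to `azure-core` would yield one job rooted at `azure-storage-common` and one rooted at `azure-example-lib`. The greedy selection here produces non-overlapping jobs; a real implementation may accept overlap in exchange for simpler root selection.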

Making this change would split large From Source jobs into a few smaller ones. Overall CI time usage would go up, because common functionality would be repeated and library builds and tests may overlap across jobs, but it would remove long-running, frequently flaky jobs and result in a better overall experience.
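
One rough way to gauge that extra cost is to count how many project builds are repeated across the split jobs. The helper below is a hypothetical sketch: the function name and the representation of a job as a list of project names are assumptions, and the project lists in the usage note are invented.

```python
def redundant_builds(jobs):
    """Count project builds beyond the first one across all jobs.

    `jobs` is a list of jobs, each a list of project names that the job
    builds from source (assumed representation for this sketch).
    """
    if not jobs:
        return 0
    # Total builds performed, counting each project once per job.
    total = sum(len(set(job)) for job in jobs)
    # Distinct projects built by at least one job.
    unique = len(set().union(*jobs))
    return total - unique
```

For example, if one job builds `azure-core`, `azure-storage-common`, and `azure-storage-blob`, and a second builds `azure-core` and a hypothetical `azure-example-lib`, the helper reports one redundant build, the repeated `azure-core`.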

Metadata

Assignees

No one assigned

Labels

EngSys: This issue is impacting the engineering system.
