github repo specific schedulerId #143

@petemoore

This issue has been extracted (and slightly rewritten) from a comment on issue #16.

Tasks created by taskcluster-github have "schedulerId": "taskcluster-github".

Cancelling a task created by taskcluster-github requires the scope queue:cancel-task:taskcluster-github/<taskGroupId>/<taskId>. Because taskGroupId and taskId do not follow a repo-specific naming pattern, queue:cancel-task:taskcluster-github/* is the only scope that allows cancelling any taskcluster-github task for a given repo. That scope cannot be restricted to an individual github repo: it covers the tasks of every repo.

Using a unique schedulerId per github repo would lift this limitation. If tasks created for repo github.com/foo/bar had (e.g.) "schedulerId": "github-foo-bar", then cancelling a task would require queue:cancel-task:github-foo-bar/<taskGroupId>/<taskId> rather than queue:cancel-task:taskcluster-github/<taskGroupId>/<taskId>. It would then be relatively straightforward to grant queue:cancel-task:github-foo-bar/* to roles/clients that should be able to cancel any task for this repo only. They would not be able to cancel tasks for other github repos, as they currently can.
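The effect of a per-repo schedulerId on scope checks can be sketched as follows. This is a minimal illustration, not Taskcluster's implementation: cancelScope and satisfies are hypothetical helpers, the task group and task IDs are placeholders, and the matching rule is simplified to "exact match, or a granted scope ending in * matches by prefix".

```go
package main

import (
	"fmt"
	"strings"
)

// cancelScope builds the scope needed to cancel one task, following the
// queue:cancel-task:<schedulerId>/<taskGroupId>/<taskId> pattern.
// (Hypothetical helper, for illustration only.)
func cancelScope(schedulerID, taskGroupID, taskID string) string {
	return fmt.Sprintf("queue:cancel-task:%s/%s/%s", schedulerID, taskGroupID, taskID)
}

// satisfies reports whether a granted scope covers a required scope.
// Simplified assumption: a granted scope either matches exactly, or ends
// in "*" and everything before the "*" is a prefix of the required scope.
func satisfies(granted, required string) bool {
	if strings.HasSuffix(granted, "*") {
		return strings.HasPrefix(required, strings.TrimSuffix(granted, "*"))
	}
	return granted == required
}

func main() {
	// A role that should only be able to cancel tasks for github.com/foo/bar.
	granted := "queue:cancel-task:github-foo-bar/*"

	own := cancelScope("github-foo-bar", "exampleGroupId", "exampleTaskId")
	other := cancelScope("github-baz-qux", "exampleGroupId", "exampleTaskId")

	fmt.Println(satisfies(granted, own))   // true: same repo
	fmt.Println(satisfies(granted, other)) // false: different repo
}
```

Because the granted scope includes the trailing "/" before the "*", it does not leak to a longer schedulerId that merely starts with github-foo-bar.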

One complication is that schedulerIds are currently restricted to the pattern ^([a-zA-Z0-9-_]*)$ with a maximum length of 38 characters, so the github org/user and repository name cannot simply be embedded in the schedulerId, since the result will not necessarily comply with the required pattern. We should therefore define the schedulerId as a function of the org/user and repo name that satisfies the following properties:

  1. (Required) It always returns a schedulerId that conforms to the required regexp for schedulerId.
  2. (Required) It returns a schedulerId that is unique per repo.
  3. (Preferred) The github org/user and repo name are reasonably easy to determine from the schedulerId (i.e. the function is reverse-engineerable), or if not, it is a simple and well-defined lexical function that users could implement themselves to predict the schedulerId in any tooling they may wish to create.
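Property 1 can be checked mechanically. The sketch below assumes the constraints quoted above (the character class and the 38 character limit); validSchedulerID is a hypothetical name, and the pattern is written as ^[a-zA-Z0-9_-]*$, which accepts the same strings as ^([a-zA-Z0-9-_]*)$ but places the hyphen last in the class to avoid any ambiguity.

```go
package main

import (
	"fmt"
	"regexp"
)

// maxSchedulerIDLen is the length limit quoted in this issue.
const maxSchedulerIDLen = 38

// schedulerIDPattern accepts only ASCII letters, digits, "_" and "-".
var schedulerIDPattern = regexp.MustCompile(`^[a-zA-Z0-9_-]*$`)

// validSchedulerID reports whether id satisfies both constraints.
// (Hypothetical helper, for illustration only.)
func validSchedulerID(id string) bool {
	return len(id) <= maxSchedulerIDLen && schedulerIDPattern.MatchString(id)
}

func main() {
	fmt.Println(validSchedulerID("taskcluster-github")) // true
	fmt.Println(validSchedulerID("github-foo-bar"))     // true
	fmt.Println(validSchedulerID("github.com/foo/bar")) // false: "." and "/" are not allowed
}
```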

One example of such a function (written in Go for illustration) is the schedulerId function below:

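import (
	"crypto/sha256"
	"fmt"
	"strings"
)

// schedulerId derives a schedulerId of at most 38 characters from a github
// user/org name and a repository name.
func schedulerId(userOrOrg, repoName string) string {
	qualifiedRepo := stripASCII(userOrOrg) + "-" + stripASCII(repoName)
	// "gh-" plus up to 35 characters fits within the 38 character limit.
	if len(qualifiedRepo) <= 35 {
		return "gh-" + qualifiedRepo
	}
	// Otherwise keep the first 30 characters and append the first 5 hex
	// digits of the hash: 3 + 30 + 5 = 38 characters.
	return "gh-" + qualifiedRepo[0:30] + hash(qualifiedRepo)[0:5]
}

// hash returns the hex-encoded SHA-256 digest of orig.
func hash(orig string) (hashed string) {
	return fmt.Sprintf("%x", sha256.Sum256([]byte(orig)))
}

// stripASCII returns orig with every character other than an ASCII letter
// or digit removed.
func stripASCII(orig string) string {
	var stripped strings.Builder
	for _, char := range orig {
		if (char >= '0' && char <= '9') || (char >= 'a' && char <= 'z') || (char >= 'A' && char <= 'Z') {
			stripped.WriteRune(char)
		}
	}
	return stripped.String()
}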
