[pyoutline] Add OpenJD backend for job template serialization#2180
[pyoutline] Add OpenJD backend for job template serialization#2180mwiebe wants to merge 1 commit intoAcademySoftwareFoundation:masterfrom
Conversation
|
|
| - Layer outputs: the cue backend (spec >= 1.15) serializes get_outputs(). | ||
| Not yet mapped. OpenJD PATH parameters support dataFlow but only at | ||
| the job level, not per-step or per-task. | ||
| TODO: open a discussion on openjd-specifications for step- and task-level | ||
| file dataflow declarations (output path registration, existence checks). |
There was a problem hiding this comment.
This is crucial for some pipelines. Being able to trace what output was produced by each layer gives other systems the ability to trace assets for example for backups and recovers.
There was a problem hiding this comment.
Agreed, another reason is to be more efficient about data transfers or taking snapshots of just what's needed to run a particular task.
| file dataflow declarations (output path registration, existence checks). | ||
| - Launcher flags: pause, priority, maxretries, autoeat, facility, show, | ||
| shot, user, email, uid are OpenCue scheduling metadata with no direct | ||
| OpenJD equivalent. The os flag could map to attr.worker.os.family. |
There was a problem hiding this comment.
Is there an equivalent way to launch a job as paused?
There was a problem hiding this comment.
OpenJD doesn't have that concept, we modeled that in Deadline Cloud as part of a job that is running, while the job template at this level is concerned with defining the job's parallel structure and the commands it will run.
I think a number of these flags could map to job parameter name conventions, e.g. job parameters named Facility, Show, Shot, User, Email, Uid.
|
Hi @mwiebe It is missing the CLA Authorization. Thanks! |
|
Please note that there are some Python linting issues that need to be addressed. Additionally, when possible, remember to update the @docs/ for the new feature regarding the new feature in this pull request. |
ramonfigueiredo
left a comment
There was a problem hiding this comment.
Thanks for your contribution, @mwiebe
I've added some comments and suggestions for code improvements.
| attributes.append({"name": "opencue:attr.tag", "allOf": tag_list}) | ||
|
|
||
| # Service → custom attribute opencue:attr.service | ||
| service = layer.get_service() |
There was a problem hiding this comment.
Service name not split on comma
The cue backend does layer.get_service().split(",")[0] to extract just the first service when multiple are comma-separated. The OpenJD backend passes the full string as-is:
service = layer.get_service()
if service:
attributes.append({"name": "opencue:attr.service", "anyOf": [service]})If a layer has service="prman,shell", this would create anyOf: ["prman,shell"] instead of anyOf: ["prman"] (matching cue) or anyOf: ["prman", "shell"] (fully expanded). Should be intentional one way or the other.
There was a problem hiding this comment.
I switched it to match the cue backend's behavior, with a comment about considering the expanded anyOf case. Would be useful to work through some examples of how to map this in practice and see what makes the best sense.
| cmd = ["openjd", "run", tmp.name] | ||
|
|
||
| logger.info("Running: %s", " ".join(cmd)) | ||
| result = subprocess.run(cmd, check=False) |
There was a problem hiding this comment.
launch() doesn't capture subprocess output
result = subprocess.run(cmd, check=False)If openjd run fails, the error is just "openjd run exited with code N" with no diagnostic output. Consider capturing stderr:
result = subprocess.run(cmd, check=False, capture_output=True, text=True)
if result.returncode != 0:
raise outline.exception.OutlineException(
"openjd run exited with code %d: %s" % (result.returncode, result.stderr)
)| _replace_tokens(part, _TOKENS) for part in shlex.split(str(command_parts)) | ||
| ] | ||
| else: | ||
| resolved = ["echo", "Running {{ Task.Param.Frame }}"] |
There was a problem hiding this comment.
Silent fallback for missing command
else:
resolved = ["echo", "Running {{ Task.Param.Frame }}"]When a Shell layer has no command set, the backend silently generates an echo stub instead of raising an error. This could mask configuration mistakes. Consider raising LayerException here, since a layer with no command is almost certainly a bug.
| def _parse_memory_to_mib(value: str) -> int: | ||
| """Convert a cuebot-style memory string to MiB for OpenJD. | ||
|
|
||
| Cuebot's convertMemoryInput interprets memory values as: | ||
| - ``"512m"`` → 512 megabytes (value in MB) | ||
| - ``"4g"`` → 4 gigabytes (value in GB) | ||
| - ``"4"`` → 4 gigabytes (plain number, treated as GB) | ||
|
|
||
| All conversions produce an integer MiB value. | ||
|
|
||
| Raises: | ||
| LayerException: If the value cannot be parsed. | ||
| """ | ||
| v = value.strip().lower() | ||
| try: | ||
| if v.endswith("m"): | ||
| return int(float(v[:-1])) | ||
| elif v.endswith("g"): | ||
| return int(float(v[:-1]) * 1024) | ||
| else: | ||
| # Plain number — cuebot treats as gigabytes. | ||
| return int(float(v) * 1024) | ||
| except (ValueError, TypeError): | ||
| raise outline.exception.LayerException( | ||
| "Invalid memory value: %r — expected a number with optional " | ||
| "'m' (megabytes) or 'g' (gigabytes) suffix" % value | ||
| ) |
There was a problem hiding this comment.
_parse_memory_to_mib treats MB = MiB and GB = GiB: openjd.py:286-312
The docstring says "megabytes" / "gigabytes" but the code returns values as-if they were MiB/GiB (no 1000->1024 conversion). For example, "512m" → 512 (MiB), "4g" -> 4096 (MiB). This matches cuebot's behavior (it also doesn't distinguish), so it's correct for compatibility, but the docstring could clarify that it follows cuebot's convention of treating MB≈MiB.
| # runs before setup. | ||
| # TODO: add a public get_code() method to PyEval so backends don't | ||
| # need to access the name-mangled attribute. | ||
| code = layer._PyEval__code |
There was a problem hiding this comment.
Private attribute access for PyEval
code = layer._PyEval__codeAcknowledged with a TODO. This is fragile, if the PyEval class renames or restructures its internals, this will break silently. Since this is a prototype, fine, but this should be addressed before any production use.
| @@ -1 +1 @@ | |||
| [build-system] | |||
There was a problem hiding this comment.
yaml (PyYAML) is not an explicit dependency:
The module imports yaml at the top level, but PyYAML isn't listed in pyproject.toml dependencies. It's likely pulled in transitively through opencue_pycue, but for correctness the dependency should be explicit, or at minimum noted.
| # --------------------------------------------------------------------------- | ||
|
|
||
|
|
||
| def _frame_range_to_openjd(frame_range: str, chunk_size: int) -> str: |
There was a problem hiding this comment.
_frame_range_to_openjd doesn't use chunk_size parameter
The chunk_size parameter is accepted but never used. Either remove it, or add a comment if it's reserved for future use.
| return result | ||
|
|
||
|
|
||
| def _build_environment(layer, launcher) -> Optional[List[Dict[str, Any]]]: |
There was a problem hiding this comment.
_build_environment always returns a list
The return type is Optional[List[...]] but the function always returns a non-empty list (CUE_* vars are always present). The Optional in the signature is misleading.
| wrapper += env_exports | ||
| # 'exec' replaces the wrapper shell with the actual command process, | ||
| # so the session runner tracks the real PID for timeout/cancellation. | ||
| wrapper += "exec %s\n" % cmd_line |
There was a problem hiding this comment.
_build_step_script - exec with %s formatting
wrapper += "exec %s\n" % cmd_lineThe cmd_line is already shell-safe, so this is fine for security. But using f-string would be more idiomatic with the rest of the codebase:
wrapper += f"exec {cmd_line}\n"(Minor inconsistency - the file mixes %-formatting and f-strings, e.g., line 480 uses f-strings.)
efd7d4b to
f252b29
Compare
This is signed now, thanks! |
Add a new PyOutline backend that serializes Outline jobs into OpenJD 2023-09 job templates. This enables running OpenCue jobs via the openjd CLI or submitting them to OpenJD-compatible schedulers like AWS Deadline Cloud. - Serialize layers to OpenJD steps with CHUNK[INT] frame parameters - Map OpenCue command tokens (#IFRAME#, #FRAMESPEC#, etc.) to OpenJD expressions using the EXPR extension - Map host requirements: cores, memory, GPUs, tags, service, limits - Map dependencies including Post layer implicit depends - Wrap commands in bash scripts that export CUE_* environment variables - Support Shell, ShellScript, and PyEval layer types - Validate generated templates using openjd-model - Launch jobs locally via openjd CLI - Install with: pip install opencue-pyoutline[openjd]
Link the Issue(s) this Pull Request is related to.
OpenJobDescription/openjd-model-for-python#137
Summarize your change.
Add a new PyOutline backend that serializes Outline jobs into OpenJD 2023-09 job templates. This enables running OpenCue jobs via the openjd CLI or submitting them to OpenJD-compatible schedulers like AWS Deadline Cloud. It doesn't look at the opposite of running an OpenJD job template in OpenCue. That's a separate follow-up to perform.
This PR doesn't fix the linked issue, but implementing OpenCue's pyoutline API for defining a job is a nice way to get a feel for how such an API might look in the openjd-model library, and evaluate any missing functionality in Open Job Description for supporting OpenCue jobs.
This PR depends on an in-progress RFC for the EXPR extension in Open Job Description, so it probably should stay in draft until that is merged there.
The changes include: