Skip to content

[pyoutline] Add OpenJD backend for job template serialization#2180

Draft
mwiebe wants to merge 1 commit intoAcademySoftwareFoundation:masterfrom
mwiebe:openjd-prototype
Draft

[pyoutline] Add OpenJD backend for job template serialization#2180
mwiebe wants to merge 1 commit intoAcademySoftwareFoundation:masterfrom
mwiebe:openjd-prototype

Conversation

@mwiebe
Copy link

@mwiebe mwiebe commented Feb 20, 2026

Link the Issue(s) this Pull Request is related to.

OpenJobDescription/openjd-model-for-python#137

Summarize your change.
Add a new PyOutline backend that serializes Outline jobs into OpenJD 2023-09 job templates. This enables running OpenCue jobs via the openjd CLI or submitting them to OpenJD-compatible schedulers like AWS Deadline Cloud. It doesn't look at the opposite of running an OpenJD job template in OpenCue. That's a separate follow-up to perform.

This PR doesn't fix the linked issue, but implementing OpenCue's pyoutline API for defining a job is a nice way to get a feel for how such an API might look in the openjd-model library, and evaluate any missing functionality in Open Job Description for supporting OpenCue jobs.

This PR depends on an in-progress RFC for the EXPR extension in Open Job Description, so it probably should stay in draft until that is merged there.

The changes include:

  • Serialize layers to OpenJD steps with CHUNK[INT] frame parameters
  • Map OpenCue command tokens (#IFRAME#, #FRAMESPEC#, etc.) to OpenJD expressions using the EXPR extension
  • Map host requirements: cores, memory, GPUs, tags, service, limits
  • Map dependencies including Post layer implicit depends
  • Wrap commands in bash scripts that export CUE_* environment variables
  • Support Shell, ShellScript, and PyEval layer types
  • Validate generated templates using openjd-model
  • Launch jobs locally via openjd CLI
  • Install with: pip install opencue-pyoutline[openjd]

@linux-foundation-easycla
Copy link

linux-foundation-easycla bot commented Feb 20, 2026

CLA Signed

The committers listed above are authorized under a signed CLA.

  • ✅ login: mwiebe / name: Mark Wiebe (a829670)

Copy link
Collaborator

@DiegoTavares DiegoTavares left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work so far. I'm changing the PR description to add a checklist enlisting all features that are pending so we can keep tab on the status on the OpenJD project and even discuss what's essential.

Comment on lines +75 to +79
- Layer outputs: the cue backend (spec >= 1.15) serializes get_outputs().
Not yet mapped. OpenJD PATH parameters support dataFlow but only at
the job level, not per-step or per-task.
TODO: open a discussion on openjd-specifications for step- and task-level
file dataflow declarations (output path registration, existence checks).
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is crucial for some pipelines. Being able to trace what output was produced by each layer gives other systems the ability to trace assets for example for backups and recovers.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, another reason is to be more efficient about data transfers or taking snapshots of just what's needed to run a particular task.

file dataflow declarations (output path registration, existence checks).
- Launcher flags: pause, priority, maxretries, autoeat, facility, show,
shot, user, email, uid are OpenCue scheduling metadata with no direct
OpenJD equivalent. The os flag could map to attr.worker.os.family.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there an equivalent way to launch a job as paused?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OpenJD doesn't have that concept, we modeled that in Deadline Cloud as part of a job that is running, while the job template at this level is concerned with defining the job's parallel structure and the commands it will run.

I think a number of these flags could map to job parameter name conventions, e.g. job parameters named Facility, Show, Shot, User, Email, Uid.

@ramonfigueiredo
Copy link
Collaborator

Hi @mwiebe

It is missing the CLA Authorization.

Thanks!

@ramonfigueiredo
Copy link
Collaborator

Please note that there are some Python linting issues that need to be addressed.

Additionally, when possible, remember to update the @docs/ for the new feature regarding the new feature in this pull request.

Copy link
Collaborator

@ramonfigueiredo ramonfigueiredo left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for your contribution, @mwiebe

I've added some comments and suggestions for code improvements.

attributes.append({"name": "opencue:attr.tag", "allOf": tag_list})

# Service → custom attribute opencue:attr.service
service = layer.get_service()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Service name not split on comma

The cue backend does layer.get_service().split(",")[0] to extract just the first service when multiple are comma-separated. The OpenJD backend passes the full string as-is:

service = layer.get_service()
if service:
    attributes.append({"name": "opencue:attr.service", "anyOf": [service]})

If a layer has service="prman,shell", this would create anyOf: ["prman,shell"] instead of anyOf: ["prman"] (matching cue) or anyOf: ["prman", "shell"] (fully expanded). Should be intentional one way or the other.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I switched it to match the cue backend's behavior, with a comment about considering the expanded anyOf case. Would be useful to work through some examples of how to map this in practice and see what makes the best sense.

cmd = ["openjd", "run", tmp.name]

logger.info("Running: %s", " ".join(cmd))
result = subprocess.run(cmd, check=False)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

launch() doesn't capture subprocess output

result = subprocess.run(cmd, check=False)

If openjd run fails, the error is just "openjd run exited with code N" with no diagnostic output. Consider capturing stderr:

result = subprocess.run(cmd, check=False, capture_output=True, text=True)
if result.returncode != 0:
    raise outline.exception.OutlineException(
        "openjd run exited with code %d: %s" % (result.returncode, result.stderr)
    )

_replace_tokens(part, _TOKENS) for part in shlex.split(str(command_parts))
]
else:
resolved = ["echo", "Running {{ Task.Param.Frame }}"]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Silent fallback for missing command

else:
    resolved = ["echo", "Running {{ Task.Param.Frame }}"]

When a Shell layer has no command set, the backend silently generates an echo stub instead of raising an error. This could mask configuration mistakes. Consider raising LayerException here, since a layer with no command is almost certainly a bug.

Comment on lines +286 to +312
def _parse_memory_to_mib(value: str) -> int:
"""Convert a cuebot-style memory string to MiB for OpenJD.

Cuebot's convertMemoryInput interprets memory values as:
- ``"512m"`` → 512 megabytes (value in MB)
- ``"4g"`` → 4 gigabytes (value in GB)
- ``"4"`` → 4 gigabytes (plain number, treated as GB)

All conversions produce an integer MiB value.

Raises:
LayerException: If the value cannot be parsed.
"""
v = value.strip().lower()
try:
if v.endswith("m"):
return int(float(v[:-1]))
elif v.endswith("g"):
return int(float(v[:-1]) * 1024)
else:
# Plain number — cuebot treats as gigabytes.
return int(float(v) * 1024)
except (ValueError, TypeError):
raise outline.exception.LayerException(
"Invalid memory value: %r — expected a number with optional "
"'m' (megabytes) or 'g' (gigabytes) suffix" % value
)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_parse_memory_to_mib treats MB = MiB and GB = GiB: openjd.py:286-312

The docstring says "megabytes" / "gigabytes" but the code returns values as-if they were MiB/GiB (no 1000->1024 conversion). For example, "512m" → 512 (MiB), "4g" -> 4096 (MiB). This matches cuebot's behavior (it also doesn't distinguish), so it's correct for compatibility, but the docstring could clarify that it follows cuebot's convention of treating MB≈MiB.

# runs before setup.
# TODO: add a public get_code() method to PyEval so backends don't
# need to access the name-mangled attribute.
code = layer._PyEval__code
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Private attribute access for PyEval

code = layer._PyEval__code

Acknowledged with a TODO. This is fragile, if the PyEval class renames or restructures its internals, this will break silently. Since this is a prototype, fine, but this should be addressed before any production use.

@@ -1 +1 @@
[build-system]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yaml (PyYAML) is not an explicit dependency:

The module imports yaml at the top level, but PyYAML isn't listed in pyproject.toml dependencies. It's likely pulled in transitively through opencue_pycue, but for correctness the dependency should be explicit, or at minimum noted.

# ---------------------------------------------------------------------------


def _frame_range_to_openjd(frame_range: str, chunk_size: int) -> str:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_frame_range_to_openjd doesn't use chunk_size parameter

The chunk_size parameter is accepted but never used. Either remove it, or add a comment if it's reserved for future use.

return result


def _build_environment(layer, launcher) -> Optional[List[Dict[str, Any]]]:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_build_environment always returns a list

The return type is Optional[List[...]] but the function always returns a non-empty list (CUE_* vars are always present). The Optional in the signature is misleading.

wrapper += env_exports
# 'exec' replaces the wrapper shell with the actual command process,
# so the session runner tracks the real PID for timeout/cancellation.
wrapper += "exec %s\n" % cmd_line
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

_build_step_script - exec with %s formatting

wrapper += "exec %s\n" % cmd_line

The cmd_line is already shell-safe, so this is fine for security. But using f-string would be more idiomatic with the rest of the codebase:

wrapper += f"exec {cmd_line}\n"

(Minor inconsistency - the file mixes %-formatting and f-strings, e.g., line 480 uses f-strings.)

@mwiebe
Copy link
Author

mwiebe commented Mar 4, 2026

Hi @mwiebe

It is missing the CLA Authorization.

Thanks!

This is signed now, thanks!

@mwiebe mwiebe force-pushed the openjd-prototype branch from f252b29 to 27d8387 Compare March 5, 2026 02:51
Add a new PyOutline backend that serializes Outline jobs into OpenJD
2023-09 job templates. This enables running OpenCue jobs via the openjd
CLI or submitting them to OpenJD-compatible schedulers like AWS Deadline
Cloud.

- Serialize layers to OpenJD steps with CHUNK[INT] frame parameters
- Map OpenCue command tokens (#IFRAME#, #FRAMESPEC#, etc.) to OpenJD
  expressions using the EXPR extension
- Map host requirements: cores, memory, GPUs, tags, service, limits
- Map dependencies including Post layer implicit depends
- Wrap commands in bash scripts that export CUE_* environment variables
- Support Shell, ShellScript, and PyEval layer types
- Validate generated templates using openjd-model
- Launch jobs locally via openjd CLI
- Install with: pip install opencue-pyoutline[openjd]
@mwiebe mwiebe force-pushed the openjd-prototype branch from 27d8387 to a829670 Compare March 5, 2026 04:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants