246 changes: 244 additions & 2 deletions docs/AuthoringApps.md
@@ -64,9 +64,10 @@ requirements:
- "pixi>=0.40"
- npm
- "maven>=3.9"
- miniforge
```

**Supported tools:** `pixi`, `npm`, `maven`
**Supported tools:** `pixi`, `npm`, `maven`, `miniforge`, `apptainer`, `nextflow`

**Supported version operators:** `>=`, `<=`, `!=`, `==`, `>`, `<`
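A requirement spec pairs a tool name with an optional operator and version. As a hedged sketch of how such a string might be split (not Fileglancer's actual parser), using only the operators listed above:

```python
import re

# Hypothetical sketch of splitting a spec such as "pixi>=0.40";
# the parsing logic in Fileglancer itself may differ.
REQ_RE = re.compile(
    r"^(?P<tool>[A-Za-z0-9_-]+)\s*(?:(?P<op>>=|<=|!=|==|>|<)\s*(?P<ver>[\d.]+))?$"
)

def parse_requirement(spec: str):
    m = REQ_RE.match(spec.strip())
    if m is None:
        raise ValueError(f"invalid requirement: {spec!r}")
    return m.group("tool"), m.group("op"), m.group("ver")

print(parse_requirement("pixi>=0.40"))  # ('pixi', '>=', '0.40')
print(parse_requirement("miniforge"))   # ('miniforge', None, None)
```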

@@ -80,13 +81,18 @@ Each runnable defines a single command that users can launch. If the manifest ha
|-------|------|----------|-------------|
| `id` | string | yes | Unique identifier (used in CLI flags and URLs, should be URL-safe) |
| `name` | string | yes | Display name shown in the UI |
| `type` | string | no | `"job"` (default) for batch jobs or `"service"` for long-running services (see [Services](#services)) |
| `description` | string | no | Longer description of what this runnable does |
| `command` | string | yes | Base shell command to execute (see [Command Building](#command-building)) |
| `parameters` | list of objects | no | Parameter definitions (see [Parameters](#parameters)) |
| `resources` | object | no | Default cluster resource requests (see [Resources](#resources)) |
| `env` | object | no | Default environment variables to export (see [Environment Variables](#environment-variables)) |
| `pre_run` | string | no | Shell script to run before the main command (see [Pre/Post-Run Scripts](#prepost-run-scripts)) |
| `post_run` | string | no | Shell script to run after the main command (see [Pre/Post-Run Scripts](#prepost-run-scripts)) |
| `conda_env` | string | no | Conda environment name or absolute path to activate before running (see [Conda Environments](#conda-environments)) |
| `container` | string | no | Container image URL for Apptainer (see [Containers](#containers-apptainer)) |
| `bind_paths` | list of strings | no | Additional paths to bind-mount into the container (requires `container`) |
| `container_args` | string | no | Default extra arguments for container exec (e.g. `--nv`), overridable at launch time |
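Putting several of these fields together (all values illustrative), a runnable entry might look like:

```yaml
runnables:
  - id: convert
    name: Convert Dataset
    type: job            # optional; "job" is the default
    description: Convert an input file to the target format
    command: pixi run python convert.py
    parameters:
      - flag: --input
        name: Input
        type: file
        required: true
    resources:
      cpus: 2
      memory: "8 GB"
    env:
      JAVA_HOME: /opt/java
```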

### Parameters

@@ -220,7 +226,12 @@ The generated job script has the following structure:

```bash
unset PIXI_PROJECT_MANIFEST
cd /path/to/repo
export FG_WORK_DIR='/home/user/.fileglancer/jobs/42-MyApp-convert'
cd "$FG_WORK_DIR/repo"

# Conda activation (if conda_env is set)
eval "$(conda shell.bash hook)"
conda activate myenv

# Environment variables
export JAVA_HOME='/opt/java'
@@ -238,6 +249,110 @@ nextflow run main.nf \
echo "Conversion complete"
```

`FG_WORK_DIR` is always exported and points to the job's working directory. See [Environment Variables Set by Fileglancer](#environment-variables-set-by-fileglancer) for the full list.

### Conda Environments

The `conda_env` field specifies a conda environment to activate before running the command. This requires `miniforge` (or any conda distribution providing the `conda` binary) in the `requirements` list.

The value can be either:
- **An environment name** (e.g. `myenv`): must match `[a-zA-Z0-9_.-]+`
- **An absolute path** (e.g. `/opt/envs/myenv`): must not contain shell metacharacters
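The two accepted forms could be checked with a sketch like the following. The name pattern comes from the rule above; the exact shell-metacharacter set is an assumption, and Fileglancer's actual validation may differ:

```python
import re

NAME_RE = re.compile(r"^[a-zA-Z0-9_.-]+$")
# Assumed metacharacter set; the real check may cover more or fewer characters.
SHELL_META = set(";&|<>$`*?()[]{}'\"\\ \t\n")

def is_valid_conda_env(value: str) -> bool:
    if value.startswith("/"):
        # absolute path form: reject shell metacharacters
        return not any(c in SHELL_META for c in value)
    # environment name form: must match [a-zA-Z0-9_.-]+
    return bool(NAME_RE.match(value))

assert is_valid_conda_env("my-analysis-env")
assert is_valid_conda_env("/opt/envs/myenv")
assert not is_valid_conda_env("bad; rm -rf /")
```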

```yaml
name: My Analysis Tool
requirements:
- miniforge
runnables:
- id: analyze
name: Run Analysis
command: python analyze.py
conda_env: my-analysis-env
parameters:
- flag: --input
name: Input
type: file
required: true
```

When `conda_env` is set, the generated script initializes conda and activates the environment before the environment variables are exported and before `pre_run` or the main command runs:

```bash
eval "$(conda shell.bash hook)"
conda activate my-analysis-env
```

> **Tip:** If the conda environment needs to be created before use (e.g. from an `environment.yml`), use `pre_run` to create it first:
>
> ```yaml
> conda_env: my-tool-env
> pre_run: |
> conda env create -f environment.yml -n my-tool-env --yes 2>/dev/null || true
> ```

### Containers (Apptainer)

The `container` field specifies a container image to run the command inside using [Apptainer](https://apptainer.org/) (formerly Singularity). This requires `apptainer` in the `requirements` list.

The value is a container image URL, typically from a Docker/OCI registry:

- `ghcr.io/org/image:tag` — GitHub Container Registry
- `docker://ghcr.io/org/image:tag` — explicit Docker protocol prefix (added automatically if absent)

```yaml
name: Lolcow
requirements:
- apptainer
runnables:
- id: say
name: Cow Say
command: cowsay
container: godlovedc/lolcow
parameters:
- name: Message
type: string
description: What the cow should say
required: true
default: "Hello from Fileglancer!"
```

When `container` is set, the generated script:

1. Creates a SIF cache directory (defaults to `~/.fileglancer/apptainer_cache/`, configurable in Preferences)
2. Pulls the image to a `.sif` file if not already cached
3. Runs the command inside the container via `apptainer exec`

```bash
# Apptainer container setup
APPTAINER_CACHE_DIR=$HOME/.fileglancer/apptainer_cache
mkdir -p "$APPTAINER_CACHE_DIR"
SIF_PATH="$APPTAINER_CACHE_DIR/godlovedc_lolcow.sif"
if [ ! -f "$SIF_PATH" ]; then
apptainer pull "$SIF_PATH" 'docker://godlovedc/lolcow'
fi
apptainer exec --bind /home/user/.fileglancer/jobs/1-lolcow-say "$SIF_PATH" \
cowsay \
'Hello from Fileglancer!'
```
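The cached filename in this script appears to be derived from the image reference. A hedged sketch of that derivation, consistent with `godlovedc/lolcow` becoming `godlovedc_lolcow.sif` (the real naming scheme may differ):

```python
# Hypothetical sketch; Fileglancer's actual normalization may differ.
def normalize_image_url(image: str) -> str:
    # "docker://" is added automatically if no protocol prefix is present
    return image if "://" in image else f"docker://{image}"

def sif_filename(image: str) -> str:
    # strip the protocol, then flatten path separators and tags
    ref = normalize_image_url(image).split("://", 1)[1]
    return ref.replace("/", "_").replace(":", "_") + ".sif"

assert normalize_image_url("godlovedc/lolcow") == "docker://godlovedc/lolcow"
assert sif_filename("godlovedc/lolcow") == "godlovedc_lolcow.sif"
```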

**Bind mounts** are auto-detected from file and directory parameters. The job's working directory is always bound. Use `bind_paths` to add extra paths:

```yaml
container: ghcr.io/org/image:tag
bind_paths:
- /shared/reference-data
- /scratch
```
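Assembled together, the bind mounts and extra arguments might feed into the exec invocation roughly like this (`build_exec` is a hypothetical helper for illustration, not Fileglancer API):

```python
import shlex

def build_exec(sif_path, command, work_dir, bind_paths=(), container_args=""):
    parts = ["apptainer", "exec"]
    for p in [work_dir, *bind_paths]:         # the work dir is always bound
        parts += ["--bind", p]
    if container_args:
        parts += shlex.split(container_args)  # e.g. "--nv"
    return parts + [sif_path] + shlex.split(command)

cmd = build_exec("cache/lolcow.sif", "cowsay hello", "/work",
                 bind_paths=["/scratch"], container_args="--nv")
assert cmd[:2] == ["apptainer", "exec"]
assert "--nv" in cmd and "/scratch" in cmd
```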

**Extra Apptainer arguments** can be set as defaults in the manifest with `container_args`, and overridden by the user at launch time through the UI's Environment tab:

```yaml
container: ghcr.io/org/cuda-tool:latest
container_args: "--nv"
```

> **Important:** `conda_env` and `container` are mutually exclusive — you cannot use both on the same entry point.

## Command Building

When a job is submitted, Fileglancer constructs the full shell command from the runnable's `command` field and the user-provided parameter values using a two-pass approach:
@@ -314,6 +429,133 @@ When a user submits a job:

Users can view logs, relaunch with the same parameters, or cancel running jobs from the Fileglancer UI.

## Services

A **service** is a long-running process (web server, notebook, API, viewer) that runs until the user explicitly stops it. Services are declared with `type: service` on the runnable:

```yaml
runnables:
- id: notebook
name: JupyterLab
type: service
command: jupyter lab --no-browser --ip=0.0.0.0 --port=0
resources:
cpus: 4
memory: "32 GB"
walltime: "08:00"
```

### How It Works

From the cluster's perspective, a service is just a long-running batch job. The difference is in how Fileglancer communicates the service URL to the user:

1. User launches a service-type runnable → job enters PENDING state
2. Cluster picks it up → RUNNING
3. The service starts, binds a port, and writes its URL to the file at `SERVICE_URL_PATH`
4. On the next poll (every few seconds), Fileglancer reads the file and displays the URL in the UI
5. User clicks "Open Service" → service opens in a new browser tab
6. When done, user clicks "Stop Service" → job is killed and the URL disappears

### Writing the Service URL

For service-type runnables, Fileglancer exports `SERVICE_URL_PATH`, the absolute path of a file to which your service must write its URL (e.g. `http://hostname:port`) once it is ready to accept connections.

Example in Python:

```python
import os, socket

url = f"http://{socket.gethostname()}:{port}"
service_url_path = os.environ.get("SERVICE_URL_PATH")
if service_url_path:
with open(service_url_path, "w") as f:
f.write(url)
```

Example in Bash:

```bash
echo "http://$(hostname):${PORT}" > "$SERVICE_URL_PATH"
```

The URL must start with `http://` or `https://`. Fileglancer validates this before displaying it. If the file doesn't exist or contains an invalid URL, no link is shown.
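The validation described above amounts to a prefix check; a minimal sketch:

```python
def is_valid_service_url(text: str) -> bool:
    # Fileglancer only displays URLs beginning with http:// or https://
    return text.strip().startswith(("http://", "https://"))

assert is_valid_service_url("http://node042:8888\n")
assert is_valid_service_url("https://node042:8443")
assert not is_valid_service_url("node042:8888")
```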

### Service Lifecycle

- **Startup**: The service should write its URL to `SERVICE_URL_PATH` as soon as it is ready. Until the file exists, the UI shows "Service is starting up..."
- **Running**: Fileglancer reads the URL file on each poll. If the URL changes (e.g. port rebind), the UI updates automatically.
- **Shutdown**: When the user clicks "Stop Service", Fileglancer sends a SIGTERM to the job. Services should handle this signal for graceful shutdown. Cleaning up the URL file on exit is good practice but not required — Fileglancer only reads it while the job status is RUNNING.
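A minimal SIGTERM handler for the shutdown step might look like this in Python (a sketch; prefer your framework's own shutdown hooks where available):

```python
import signal
import sys

def handle_sigterm(signum, frame):
    # flush state, close connections, then exit cleanly
    print("received SIGTERM, shutting down", flush=True)
    sys.exit(0)

signal.signal(signal.SIGTERM, handle_sigterm)
```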

### Tips

- **Port selection**: Use port 0 or auto-detection to avoid conflicts when multiple services run on the same node
- **Walltime**: Set a generous walltime — services run until stopped, but the cluster will kill them if walltime expires. Consider `"08:00"` or longer for interactive sessions
- **Flush output**: If running under a batch scheduler like LSF, Python's stdout may be buffered. Use `flush=True` on print statements or set `PYTHONUNBUFFERED=1` so logs appear in real time
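For servers that do not support `--port=0` natively, one common pattern is to ask the OS for a free port up front (illustrative, not Fileglancer API):

```python
import socket

def pick_free_port() -> int:
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
        s.bind(("", 0))              # port 0: let the OS choose
        return s.getsockname()[1]

port = pick_free_port()
assert 0 < port < 65536
```

Note that there is a small race window between picking the port and the service binding it; letting the server bind port 0 itself avoids this entirely.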

### Service Example

```yaml
name: My Viewer
description: Interactive data viewer
version: "1.0"

runnables:
- id: view
name: Start Viewer
type: service
description: Launch an interactive viewer for browsing datasets
command: pixi run python start_viewer.py
parameters:
- flag: --data-dir
name: Data Directory
type: directory
description: Directory containing datasets to view
required: true

resources:
cpus: 2
memory: "8 GB"
walltime: "08:00"
```

## Environment Variables Set by Fileglancer

Fileglancer exports the following environment variables in every job script:

| Variable | Availability | Description |
|----------|-------------|-------------|
| `FG_WORK_DIR` | All jobs | Absolute path to the job's working directory (contains `repo/` symlink, log files, etc.) |
| `SERVICE_URL_PATH` | Service-type jobs only | Absolute path where the service should write its URL. Equivalent to `$FG_WORK_DIR/service_url` |

These are available to `pre_run` scripts, the main command, and `post_run` scripts.
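The equivalence stated in the table can be expressed directly (an illustrative helper, not Fileglancer API):

```python
import os

# SERVICE_URL_PATH == $FG_WORK_DIR/service_url for service-type jobs
def default_service_url_path(work_dir: str) -> str:
    return os.path.join(work_dir, "service_url")

assert default_service_url_path("/home/user/.fileglancer/jobs/42-MyApp-convert") \
       == "/home/user/.fileglancer/jobs/42-MyApp-convert/service_url"
```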

## Server Configuration for Apps

Some aspects of app execution are controlled by the Fileglancer server's `config.yaml`, not by individual app manifests. These settings are managed by the system administrator.

### Extra Paths (`extra_paths`)

The `extra_paths` cluster setting lets administrators add directories to `$PATH` for all job scripts. This is useful for making tools like `nextflow`, `pixi`, or `apptainer` available without requiring users to configure their own environments.

```yaml
cluster:
extra_paths:
- /opt/nextflow/bin
- /opt/pixi/bin
- /usr/local/apptainer/bin
```

These paths are:

1. **Appended to `$PATH` in every generated job script** — the user's own `$PATH` entries take precedence
2. **Used when verifying tool requirements** — so `requirements: [nextflow]` can find `/opt/nextflow/bin/nextflow` even if it's not on the server process's default `$PATH`
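The lookup in point 2 can be sketched with `shutil.which` over a combined search path (an assumption about the implementation, not the server's actual code):

```python
import os
import shutil

def find_tool(tool, extra_paths=()):
    search = os.environ.get("PATH", "")
    for p in extra_paths:
        search += os.pathsep + p       # appended, so user PATH entries win
    return shutil.which(tool, path=search)

# e.g. find_tool("nextflow", ["/opt/nextflow/bin"])
```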

### Container Cache Directory

By default, Apptainer container images (SIF files) are cached at `~/.fileglancer/apptainer_cache/`. Users can override this per-user in the Preferences page under "Container cache directory".

See the [config.yaml.template](config.yaml.template) for all available cluster settings.

## Full Example

```yaml
7 changes: 5 additions & 2 deletions docs/config.yaml.template
@@ -129,8 +129,8 @@ session_cookie_secure: true
# Mirrors py-cluster-api ClusterConfig - all fields are optional
# See: https://github.com/JaneliaSciComp/py-cluster-api
#
# cluster:
# executor: local # "local" or "lsf"
cluster:
executor: local # "local" or "lsf"
# job_name_prefix: fg # Prefix for cluster job names. REQUIRED for job
# # reconnection after server restarts. Without this,
# # active jobs will not be re-tracked and will
@@ -149,3 +149,6 @@ session_cookie_secure: true
# - "your_project"
# script_prologue: # Commands to run before each job
# - "module load java/11"
# extra_paths: # Paths appended to $PATH in every job script
# - /opt/nextflow/bin # and used when verifying tool requirements
# - /opt/pixi/bin
2 changes: 1 addition & 1 deletion fileglancer/alembic.ini
@@ -2,7 +2,7 @@

[alembic]
# path to migration scripts
script_location = alembic
script_location = fileglancer/alembic

# template used to generate migration file names; The default value is %%(rev)s_%%(slug)s
# Uncomment the line below if you want the files to be prepended with date and time
@@ -0,0 +1,26 @@
"""add container and container_args columns

Revision ID: 483b2e31b85e
Revises: b5e8a3f21c76
Create Date: 2026-02-28 16:39:39.385514

"""
from alembic import op
import sqlalchemy as sa


# revision identifiers, used by Alembic.
revision = '483b2e31b85e'
down_revision = 'b5e8a3f21c76'
branch_labels = None
depends_on = None


def upgrade() -> None:
op.add_column('jobs', sa.Column('container', sa.String(), nullable=True))
op.add_column('jobs', sa.Column('container_args', sa.String(), nullable=True))


def downgrade() -> None:
op.drop_column('jobs', 'container_args')
op.drop_column('jobs', 'container')
@@ -0,0 +1,24 @@
"""add entry_point_type to jobs

Revision ID: b5e8a3f21c76
Revises: a3f7c2e19d04
Create Date: 2026-02-23 00:00:00.000000

"""
from alembic import op
import sqlalchemy as sa


# revision identifiers, used by Alembic.
revision = 'b5e8a3f21c76'
down_revision = 'a3f7c2e19d04'
branch_labels = None
depends_on = None


def upgrade() -> None:
op.add_column('jobs', sa.Column('entry_point_type', sa.String(), nullable=False, server_default='job'))


def downgrade() -> None:
op.drop_column('jobs', 'entry_point_type')