Commit 36cbc96

Revert "updated readme - limitations, phrasing, formatting"
This reverts commit de724ea. forgot to pull repo
1 parent de724ea commit 36cbc96

1 file changed

Lines changed: 77 additions & 62 deletions

README.md

@@ -5,13 +5,13 @@
55
[![python](https://img.shields.io/badge/python-3.10+-blue.svg)](https://www.python.org/download)
66
[![last-commit](https://img.shields.io/github/last-commit/CESNET/tesp-api)]()
77

8-
This is a task execution microservice based on the [TES standard](https://github.com/ga4gh/task-execution-schemas) that sends job executions to [Pulsar](https://github.com/galaxyproject/pulsar) application.
9-
10-
Read about our project on the [Galaxy Hub](https://galaxyproject.org/news/2025-10-06-tesp-api/) and [e-INFRA CZ Blog](https://blog.e-infra.cz/blog/tesp-api/).
11-
12-
This effort is part of the [EuroScienceGateway](https://galaxyproject.org/projects/esg/) project.
13-
For more details on TES, see the [Task Execution Schemas documentation](https://ga4gh.github.io/task-execution-schemas/docs/).
14-
Pulsar is a Python server application that allows a [Galaxy](https://github.com/galaxyproject/galaxy) server to run jobs on remote systems.
8+
This project is an effort to create an open-source implementation of a task execution engine based on the [TES standard](https://github.com/ga4gh/task-execution-schemas)
9+
distributing executions to services exposing [Pulsar](https://github.com/galaxyproject/pulsar) application. For more details
10+
on `TES`, see the Task Execution Schemas [documentation](https://ga4gh.github.io/task-execution-schemas/docs/). `Pulsar`
11+
is a Python server application that allows a [Galaxy](https://github.com/galaxyproject/galaxy) server to run jobs on remote systems. The original intention of this
12+
project was to modify the `Pulsar` project (e.g. via forking) so its Rest API would be compatible with the `TES` standard.
13+
Later, a decision was made to create a separate microservice instead, decoupled from `Pulsar`, implementing the `TES`
14+
standard and distributing `TES` tasks execution to `Pulsar` applications.
1515

1616
## Quick start
1717

@@ -22,19 +22,19 @@ The most straightforward way to deploy the TESP is to use Docker Compose.
2222
```
2323
docker compose up -d
2424
```
25-
Starts the API and MongoDB containers. Configure an external Pulsar in `settings.toml`
26-
(default points to `http://localhost:8913`). REST is the default; AMQP is used only
27-
if `pulsar.amqp_url` is set.
25+
Expects an external Pulsar to be configured in `settings.toml` before the compose is run.
26+
So far only REST Pulsar communication is supported.
2827

2928
#### With pulsar_rest service:
3029
```
3130
docker compose --profile pulsar up -d
3231
```
33-
Starts a local Pulsar REST container in the same compose network.
3432

33+
<br />
34+
<br />
3535
<br />
3636

37-
Depending on your Docker and Docker Compose installation, you may need to use `docker-compose` (with hyphen) instead.
37+
Depending on you Docker and Docker Compose installation, you may need to use `docker-compose` (with hyphen) instead.
3838

3939
You might encounter a timeout error in container runtime which can be solved by correct `mtu` configuration either in the `docker-compose.yaml`:
4040
```
@@ -47,19 +47,17 @@ networks:
4747
or directly in your `/etc/docker/daemon.json`:
4848
```
4949
{
50-
"mtu": 1442
50+
"mtu": 1442
5151
}
5252
```
5353

54-
The Data Transfer Services (HTTP/S3/FTP) are defined in [docker/dts](docker/dts/README.md)
55-
and run via a separate compose file.
54+
The `docker-compose.yaml` also spins up a collection of [Data Transfer Services](docker/dts/README.md), which can be used for testing.
5655

57-
&nbsp;
5856
### Usage
5957
If the TESP is running, you can try to submit a task. One way is to use cURL. Although the project is still in development, the TESP should be compatible with TES, so you can try TES clients such as Snakemake or Nextflow. The example below shows how to submit a task using cURL.
6058

6159
#### 1. Create JSON file
62-
The first step is to prepare a JSON file with the task. For inspiration you can use [tests/test_jsons](tests/test_jsons) located in this repository, or the [TES documentation](https://ga4gh.github.io/task-execution-schemas/docs/).
60+
The first step is to prepare a JSON file with the task. For inspiration you can use [tests](https://github.com/CESNET/tesp-api/tree/dev/tests/test_jsons) located in this repository, or the [TES documentation](https://ga4gh.github.io/task-execution-schemas/docs/).
6361

6462
Example JSON file:
6563
```
@@ -91,10 +89,10 @@ Please check the URL of the running TES and the file with the task you just crea
9189
curl http://localhost:8080/v1/tasks -X POST -H "Content-Type: application/json" -d $(sed -e "s/ //g" example.json | tr -d '\n')
9290
```
9391
(The only reason for the subshell is to remove whitespaces and newlines.)
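Note that `sed -e "s/ //g"` removes every space, including spaces inside string values such as a task name or an executor argument. As a minimal sketch of an alternative (the helper name `compact_json` is hypothetical and not part of this repository), the file can be parsed and re-serialized with the Python standard library, which drops only the insignificant whitespace:

```python
import json


def compact_json(text: str) -> str:
    """Re-serialize JSON without insignificant whitespace or newlines."""
    # Parsing first keeps spaces inside string values intact,
    # unlike a blanket `sed "s/ //g"` over the raw file.
    return json.dumps(json.loads(text), separators=(",", ":"))


if __name__ == "__main__":
    # Assumes `example.json` is the task file created in step 1.
    with open("example.json", encoding="utf-8") as fh:
        print(compact_json(fh.read()))
```

The printed single-line payload can then be passed to `curl` in place of the subshell output.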
94-
After the task is submitted, the endpoint returns the task ID. This is useful to check the task status.
92+
After the task is submitted, the endpoint returns the task ID. This is usefull to check the task status.
9593

9694
#### 3. Check the task status
97-
There are more useful endpoints to check the task status.
95+
There are more usefull endpoints to check the task status.
9896

9997
List all tasks:
10098
```
@@ -132,9 +130,8 @@ instead of starting the project locally without `docker`. In that case only thos
132130
| poetry | 1.1.13+ | _pip install poetry_ |
133131
| mongodb | 4.4+ | _docker-compose uses latest_ |
134132
| pulsar | 0.14.13 | _actively trying to support latest. Must have access to docker with the same host as pulsar application itself_ |
135-
| ftp server | - | _optional for I/O testing. The [docker/dts](docker/dts/README.md) stack provides FTP/S3/HTTP services_. |
133+
| ftp server | - | _no real recommendation here. docker-compose uses [ftpserver](https://github.com/fclairamb/ftpserver), so a local alternative should support the same FTP commands_. |
136134

137-
&nbsp;
138135
### Configuring TESP API
139136
`TESP API` uses [dynaconf](https://www.dynaconf.com/) for its configuration. Configuration is currently set up by using
140137
[./settings.toml](https://github.com/CESNET/tesp-api/blob/main/settings.toml) file. This file declares sections which represent different environments for `TESP API`. Default section
@@ -156,25 +153,13 @@ To apply different environment (i.e. to switch which section will be picked by `
156153
`FASTAPI_PROFILE` must be set to the concrete name of such section (e.g. `FASTAPI_PROFILE=dev-docker` which can be seen
157154
in the [./docker/tesp_api/Dockerfile](https://github.com/CESNET/tesp-api/blob/main/docker/tesp_api/Dockerfile))
158155

159-
&nbsp;
160-
### Authentication
161-
`TESP API` can run without authentication (default). To enable Basic Auth, set `basic_auth.enable = true`
162-
and configure `basic_auth.username` and `basic_auth.password` in `settings.toml`. To enable OAuth2,
163-
set `oauth.enable = true` and pass a Bearer token; the token is validated via the issuer in its `iss`
164-
claim using OIDC discovery.
165-
166-
Container execution runtime is controlled by the `CONTAINER_TYPE` environment variable (`docker` or
167-
`singularity`). The default is `docker`.
168-
169-
&nbsp;
170156
### Configuring required services
171157
You can have a look at [./docker-compose.yaml](https://github.com/CESNET/tesp-api/blob/main/docker-compose.yaml) to see how
172158
the infrastructure for development should look like. Of course, you can configure those services in your preferred way if you are
173159
going to start the project without `docker` or if you are trying to create other than `development` environment but some things
174-
must remain as they are. For example, `TESP API` currently communicates with `Pulsar` via REST by default; configure Pulsar for
175-
REST unless you set `pulsar.amqp_url` to enable AMQP.
160+
must remain as they are. For example, `TESP API` currently supports communication with `Pulsar` only through its Rest API and
161+
therefore `Pulsar` must be configured in such a way.
176162

177-
&nbsp;
178163
### Current Docker services
179164
All the current `Docker` services which will be used when the project is started with `docker-compose` have common directory
180165
[./docker](https://github.com/CESNET/tesp-api/tree/main/docker) for configurations, data, logs and Dockerfiles if required.
@@ -184,12 +169,15 @@ example trying to create data folder for given service. Such issues should be re
184169
which ports to be used etc. Following services are currently defined by [./docker-compose.yaml](https://github.com/CESNET/tesp-api/blob/main/docker-compose.yaml)
185170
- **tesp-api** - This project itself. Depends on mongodb
186171
- **tesp-db** - [MongoDB](https://www.mongodb.com/) instance for persistence layer
187-
- **pulsar_rest** - `Pulsar` configured to use REST API with access to a docker instance thanks to [DIND](https://hub.docker.com/_/docker) (enabled with `--profile pulsar`).
172+
- **pulsar_rest** - `Pulsar` configured to use Rest API with access to a docker instance thanks to [DIND](https://hub.docker.com/_/docker).
173+
- **pulsar_amqp** - currently disabled, will be used in the future development
174+
- **ftpserver** - online storage for `TES` tasks input/output content
175+
- **minio** - currently acting only as a storage backend for the `ftpserver` with simple web interface to access data.
188176

189-
If you want HTTP/FTP/S3 data transfer services for testing, use the separate
190-
[docker/dts](docker/dts/README.md) compose stack.
177+
**Folder [./docker/minio/initial_data](https://github.com/CESNET/tesp-api/tree/main/docker/minio/initial_data) contains startup
178+
folders for `minio` service which must be copied to the `./docker/minio/data` folder before starting up the infrastructure. Those data
179+
configure `minio` to start with already created bucket and user which will be used by `ftpserver` for access.**
191180

192-
&nbsp;
193181
### Run the project
194182
This project uses [Poetry](https://python-poetry.org/) for `dependency management` and `packaging`. `Poetry` makes it easy
195183
to install libraries required by `TESP API`. It uses [./pyproject.toml](https://github.com/CESNET/tesp-api/blob/feature/TESP-0-github-proper-readme/pyproject.toml)
@@ -222,37 +210,34 @@ initialized properly or whether any errors occurred.
222210
- **http://localhost:8080/** - will redirect to Swagger documentation of `TESP API`. This endpoint also currently acts as a frontend.
223211
You can use it to execute REST based calls expected by the `TESP API`. Swagger is automatically generated from the sources,
224212
and therefore it corresponds to the very current state of the `TESP API` interface.
225-
- If you run the DTS stack from [docker/dts](docker/dts/README.md), MinIO console is available at
226-
**http://localhost:9001/** with `root` / `123456789` credentials.
213+
- **http://localhost:40949/** - `minio` web interface. Use `admin` and `!Password123` credentials to login. Make sure
214+
that bucket `tesp-ftp` is already present, otherwise see [Current Docker services](#current-docker-services) section of this readme to properly
215+
prepare infrastructure before the startup.
227216

228217
### Executing simple TES task
229218
This section demonstrates the execution of a simple `TES` task which calculates the _[md5sum](https://en.wikipedia.org/wiki/Md5sum)_
230219
hash of a given input. `TES` can handle I/O in several ways, but the main goal here is to also demonstrate the `ftp server`.
231220

232-
If you want to use the bundled HTTP/FTP/S3 services, start the DTS stack in [docker/dts](docker/dts/README.md)
233-
and adapt hostnames/ports to match your network setup.
234-
235-
1. Upload a new file with your preferred name and content (e.g. name `holy_file` and content `Hello World!`) to your
236-
FTP-backed storage. If you run the DTS stack, use the MinIO console at **http://localhost:9001/** to create a bucket
237-
and upload the file. This file will be accessible through your FTP service and will be used as an input file for this
238-
demonstration.
221+
1. Head over to **http://localhost:40949/buckets/tesp-ftp/browse** and upload a new file with your preferred name and content (e.g. name
222+
`holy_file` and content `Hello World!`). This file will now be accessible through the `ftpserver` service and will be used as
223+
an input file for this demonstration.
239224
2. Go to **http://localhost:8080/** and use `POST /v1/tasks` request to create following `TES` task (task is sent in the request body).
240-
In the `"inputs.url"` replace `<file_uploaded_to_storage>` with the file name you chose in the previous step. If http status of
225+
In the `"inputs.url"` replace `<file_uploaded_to_minio>` with the file name you chose in the previous step. If http status of
241226
returned response is 200, the response will contain `id` of created task in the response body which will be used to
242227
reference this task later on.
243228
```json
244229
{
245230
"inputs": [
246231
{
247-
"url": "ftp://<ftp_host>:2121/<file_uploaded_to_storage>",
232+
"url": "ftp://ftpserver:2121/<file_uploaded_to_minio>",
248233
"path": "/data/file1",
249234
"type": "FILE"
250235
}
251236
],
252237
"outputs": [
253238
{
254239
"path": "/data/outfile",
255-
"url": "ftp://<ftp_host>:2121/outfile-1",
240+
"url": "ftp://ftpserver:2121/outfile-1",
256241
"type": "FILE"
257242
}
258243
],
@@ -274,27 +259,57 @@ previous step. This request also supports `view` query parameter which can be us
274259
set to the state `COMPLETE` or one of the error states. In case of an error state, depending on its type, the error will be part
275260
of the task logs in the response (use `FULL` view), or you can inspect the logs of `TESP API` service, where error should be logged with respective
276261
message.
277-
4. Once the task completes you can check your FTP-backed storage (for the DTS stack, use the MinIO console at
278-
**http://localhost:9001/**) where you should find uploaded `outfile-1` with output content of executed
279-
_[md5sum](https://en.wikipedia.org/wiki/Md5sum)_. You can play around
262+
4. Once the task completes you can head back to **http://localhost:40949/buckets/tesp-ftp/browse** where you should find
263+
uploaded `outfile-1` with output content of executed _[md5sum](https://en.wikipedia.org/wiki/Md5sum)_. You can play around
280264
by creating different tasks, just be sure to only use functionality which is currently supported - see [Known limitations](#known-limitations).
281265
For example, you can omit `inputs.url` and instead use `inputs.content` which allows you to create input in place, or you can also
282266
omit `outputs` and `executors.stdout` in which case the output will be present in the `logs.logs.stdout` as executor is no
283267
longer configured to redirect stdout into the file.
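
The status check from step 3 can also be scripted. The following is a minimal sketch, not part of `TESP API` itself: the function names `fetch_task` and `wait_for_task` are hypothetical, the base URL is assumed to be the local `http://localhost:8080`, and the set of terminal states is an assumption taken from the state names in the `TES` specification, so check it against the version you run.

```python
import json
import time
import urllib.request

# Assumption: terminal state names as listed in the TES specification.
TERMINAL_STATES = {"COMPLETE", "EXECUTOR_ERROR", "SYSTEM_ERROR", "CANCELED"}


def fetch_task(task_id: str, base_url: str = "http://localhost:8080") -> dict:
    """GET /v1/tasks/{id} with the FULL view and decode the JSON body."""
    url = f"{base_url}/v1/tasks/{task_id}?view=FULL"
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def wait_for_task(task_id: str, fetch=fetch_task,
                  interval: float = 2.0, max_polls: int = 150) -> dict:
    """Poll the task until it reaches a terminal state or polls run out."""
    for _ in range(max_polls):
        task = fetch(task_id)
        if task.get("state") in TERMINAL_STATES:
            return task
        time.sleep(interval)
    raise TimeoutError(f"task {task_id} did not finish after {max_polls} polls")
```

Calling `wait_for_task("<task_id>")` with the `id` returned by `POST /v1/tasks` returns the final task document, whose `state` and `logs` fields can then be inspected. The `fetch` parameter exists only so the polling loop can be exercised without a running server.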
284268

285-
&nbsp;
286269
### Known limitations of TESP API
287270
| Domain | Limitation |
288271
|----------|:-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
272+
| _Pulsar_ | `TESP API` communicates with `Pulsar` only through its REST API, missing functionality for message queues |
289273
| _Pulsar_ | `TESP API` should be able to dispatch executions to multiple `Pulsar` services via different types of `Pulsar` interfaces. Currently, only one `Pulsar` service is supported |
290274
| _Pulsar_ | `Pulsar` must be "polled" for job state. Preferably `Pulsar` should notify `TESP API` about state change. This is already default behavior when using `Pulsar` with message queues |
291-
| _TES_ | Canceling a `TES` task calls Pulsar's cancel endpoint but container termination depends on Pulsar/runtime behavior. In-flight tasks may still complete. |
292-
| _TES_ | Only `cpu_cores` and `ram_gb` are mapped to container runtime flags. Other resource fields (disk, preemptible, zones) are stored but not enforced. |
293-
| _TES_ | Task `tags` are accepted and stored but not used by the scheduler or runtime. |
294-
| _TES_ | Task `logs.outputs` is not populated. Use `outputs` to persist result files. |
275+
| _TES_ | Canceling a `TES` task does not immediately stop the task. A task cannot even be canceled while it is running. |
276+
| _TES_ | `TES` does not specify which URL schemes must be supported for file transfer (e.g. tasks `inputs.url`). Only FTP is supported for now |
277+
| _TES_ | tasks `inputs.type` and `outputs.type` can be either DIRECTORY or FILE. Only FILE is supported, DIRECTORY will lead to undefined behavior for now |
278+
| _TES_ | tasks `resources` currently do not change execution behavior in any way. This configuration will take effect once `Pulsar` limitations are resolved |
279+
| _TES_ | tasks `executors.workdir` and `executors.env` functionality is not yet implemented. You can use them but they will have no effect |
280+
| _TES_ | tasks `volumes` and `tags` functionality is not yet implemented. You can use them but they will have no effect |
281+
| _TES_ | tasks `logs.outputs` functionality is not yet implemented. However this limitation can be bypassed with tasks `outputs` |
295282

296283
&nbsp;
284+
## GIT
285+
The current main branch is `origin/main`, which also serves as the release branch for now. Developers should typically derive their
286+
own feature branches, e.g. `feature/TESP-111-task-monitoring`. This project has no CI/CD configured yet. Releases are
287+
done manually by creating a tag in the current release branch. No issue tracking software is configured yet, but for
288+
any possible future integration this project should reference commits, branches, PRs, etc. with the prefix `TESP-0` as a reference
289+
to work that was done before such integration. Pull requests should be merged using the `Squash and merge` option with the message format `Merge pull request #<PRnum> from <branch-name>`.
290+
Since there is no CI/CD setup, this is only an opinionated view of how branching policies should work, and for now everything is possible.
297291

298-
History note: _The original intention of this project was to modify the `Pulsar` project so its Rest API would be compatible with the `TES` standard.
299-
Later, a decision was made to create a separate microservice instead, decoupled from `Pulsar`, implementing the `TES`
300-
standard and distributing `TES` tasks execution to `Pulsar` applications._
292+
## License
293+
294+
[![license](https://img.shields.io/github/license/CESNET/tesp-api)](https://github.com/CESNET/tesp-api/blob/main/LICENSE.md)
295+
```
296+
Copyright (c) 2022 Norbert Dopjera
297+
298+
Permission is hereby granted, free of charge, to any person obtaining a copy
299+
of this software and associated documentation files (the "Software"), to deal
300+
in the Software without restriction, including without limitation the rights
301+
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
302+
copies of the Software, and to permit persons to whom the Software is
303+
furnished to do so, subject to the following conditions:
304+
305+
The above copyright notice and this permission notice shall be included in all
306+
copies or substantial portions of the Software.
307+
308+
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
309+
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
310+
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
311+
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
312+
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
313+
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
314+
SOFTWARE.
315+
```
