Skip to content

[ENH] V1 -> V2 Migration - Flows (module)#1609

Open
Omswastik-11 wants to merge 296 commits intoopenml:mainfrom
Omswastik-11:flow-migration-stacked
Open

[ENH] V1 -> V2 Migration - Flows (module)#1609
Omswastik-11 wants to merge 296 commits intoopenml:mainfrom
Omswastik-11:flow-migration-stacked

Conversation

@Omswastik-11
Copy link
Contributor

@Omswastik-11 Omswastik-11 commented Jan 8, 2026

Fixes #1601

added a Create method in FlowAPI for publishing flow but not refactored with old publish . (Needs discussion on this)
Added tests using fake_methods so that we can test without local V2 server . I have tested the FlowsV2 methods (get and exists ) and delete and list were not implemented in V2 server so I skipped them .

@Omswastik-11 Omswastik-11 changed the title [ENH] V1 -> V2 Migration [ENH] V1 -> V2 Migration - Flows (module) Jan 8, 2026
@Omswastik-11 Omswastik-11 marked this pull request as ready for review January 8, 2026 15:09
@geetu040 geetu040 mentioned this pull request Jan 9, 2026
18 tasks
@codecov-commenter
Copy link

codecov-commenter commented Jan 12, 2026

Codecov Report

❌ Patch coverage is 86.72566% with 15 lines in your changes missing coverage. Please review.
✅ Project coverage is 81.60%. Comparing base (8ff14eb) to head (b89a084).

Files with missing lines Patch % Lines
openml/_api/resources/flow.py 86.51% 12 Missing ⚠️
openml/flows/flow.py 85.71% 2 Missing ⚠️
openml/flows/functions.py 90.00% 1 Missing ⚠️
Additional details and impacted files
@@             Coverage Diff             @@
##             main    #1609       +/-   ##
===========================================
+ Coverage   53.96%   81.60%   +27.63%     
===========================================
  Files          61       61               
  Lines        5051     5094       +43     
===========================================
+ Hits         2726     4157     +1431     
+ Misses       2325      937     -1388     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Copy link
Collaborator

@geetu040 geetu040 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nicely done overall.

@Omswastik-11 Omswastik-11 force-pushed the flow-migration-stacked branch from 614411f to 36184e5 Compare January 14, 2026 14:56
@Omswastik-11 Omswastik-11 requested a review from geetu040 January 14, 2026 15:10
@Omswastik-11
Copy link
Contributor Author

Hi @geetu040 !! Can you review the pre-commit failure It was due to merge conflicts more specifically for tasks . Should I change it on my branch ?

@geetu040
Copy link
Collaborator

Hi @geetu040 !! Can you review the pre-commit failure It was due to merge conflicts more specifically for tasks . Should I change it on my branch ?

can you try again, sync your branch with mine? It should be fixed now.

@Omswastik-11
Copy link
Contributor Author

I think that due to conflicts it did not synced properly . I have to revert it manually

Copy link
Collaborator

@geetu040 geetu040 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice overall, few changes needed. I'll look at the tests later when the implementation is final.

@Omswastik-11 Omswastik-11 requested a review from geetu040 January 21, 2026 18:25
Copy link
Collaborator

@geetu040 geetu040 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

update with #1576 (comment)

Copilot AI review requested due to automatic review settings March 16, 2026 14:24
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 42 out of 43 changed files in this pull request and generated 5 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 46 to 48
def _mocked_perform_api_call(call, request_method):
url = openml.config.server + "/" + call
url = openml.config.server + call
return openml._api_calls._download_text_file(url)
Comment on lines +455 to 458
@mock.patch.object(requests.Session, "request")
def test_delete_flow_not_owned(mock_request, test_files_directory, test_apikey_v1):
openml.config.start_using_configuration_for_example()
content_file = test_files_directory / "mock_responses" / "flows" / "flow_delete_not_owned.xml"
Comment on lines 468 to +471
openml.flows.delete_flow(40_000)

flow_url = f"{openml.config.TEST_SERVER_URL}/api/v1/xml/flow/40000"
assert flow_url == mock_delete.call_args.args[0]
assert test_api_key == mock_delete.call_args.kwargs.get("params", {}).get("api_key")
assert mock_request.call_args.kwargs.get("method") == "DELETE"
assert test_apikey_v1 == mock_request.call_args.kwargs.get("params", {}).get("api_key")
Comment on lines +294 to +302
raise ValueError(
f'invalid api_version="{api_version}" '
f"allowed versions: {', '.join(list(APIVersion))}"
)

if fallback_api_version is not None and fallback_api_version not in APIVersion:
raise ValueError(
f'invalid fallback_api_version="{fallback_api_version}" '
f"allowed versions: {', '.join(list(APIVersion))}"
Comment on lines 299 to +310
TestBase.logger.info(f"collected from {__file__.split('/')[-1]}: {flow.flow_id}")


@pytest.mark.sklearn()
@mock.patch("openml.flows.functions.get_flow")
@mock.patch("openml.flows.functions.flow_exists")
@mock.patch("openml._api_calls._perform_api_call")
def test_publish_error(self, api_call_mock, flow_exists_mock, get_flow_mock):
@mock.patch("requests.Session.request")
def test_publish_error(self, mock_request, flow_exists_mock, get_flow_mock):
model = sklearn.ensemble.RandomForestClassifier()
flow = self.extension.model_to_flow(model)
api_call_mock.return_value = (

# Create mock response directly
Copy link
Collaborator

@geetu040 geetu040 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The last sync with base-pr was not clean. Some changes were still there from the previous commits that caused the test failures, though it showed another bug in the base PR which I've fixed now.
Please update this PR with the base PR, make sure the merge is clean and the CI is green.

Copilot AI review requested due to automatic review settings March 23, 2026 18:45
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 8 out of 8 changed files in this pull request and generated 9 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +64 to +65
data = {"name": name, "external_version": external_version, "api_key": self._http.api_key}
xml_response = self._http.post("flow/exists", data=data).text
Copy link

Copilot AI Mar 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FlowV1API.exists() calls HTTPClient.post() with the default use_api_key=True, which raises OpenMLAuthenticationError when no API key is configured. The legacy flow/exists endpoint worked without an API key (it only included api_key when present), so this is a behavior regression for anonymous users. Consider calling self._http.post(..., use_api_key=False) and omit api_key from the payload (or make auth optional).

Suggested change
data = {"name": name, "external_version": external_version, "api_key": self._http.api_key}
xml_response = self._http.post("flow/exists", data=data).text
data = {"name": name, "external_version": external_version}
if self._http.api_key:
data["api_key"] = self._http.api_key
xml_response = self._http.post("flow/exists", data=data, use_api_key=False).text

Copilot uses AI. Check for mistakes.
Comment on lines +210 to +217
if not (isinstance(name, str) and len(name) > 0):
raise ValueError("Argument 'name' should be a non-empty string")
if not (isinstance(external_version, str) and len(external_version) > 0):
raise ValueError("Argument 'version' should be a non-empty string")

try:
response = self._http.get(f"flows/exists/{name}/{external_version}/")
result = response.json()
Copy link

Copilot AI Mar 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FlowV2API.exists() interpolates name and external_version directly into the URL path. These values can contain characters that require URL escaping, which can break requests or cause ambiguous routing. Quote/escape each path segment (e.g., urllib.parse.quote) or send them as query parameters if supported by the v2 endpoint.

Copilot uses AI. Check for mistakes.
import xmltodict

import openml
import openml._api_calls
Copy link

Copilot AI Mar 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

openml._api_calls is imported but no longer used in this module after switching to the new backend APIs. This will fail linting/static checks; please remove the unused import.

Suggested change
import openml._api_calls

Copilot uses AI. Check for mistakes.
from openml._api_calls import _perform_api_call
from openml.testing import SimpleImputer, TestBase
import requests
from openml.testing import SimpleImputer, TestBase, create_request_response
Copy link

Copilot AI Mar 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unused import: create_request_response is imported but not used anywhere in this test file. Please remove it to keep the test module clean and avoid lint failures.

Suggested change
from openml.testing import SimpleImputer, TestBase, create_request_response
from openml.testing import SimpleImputer, TestBase

Copilot uses AI. Check for mistakes.
Comment on lines +92 to +105
mock_request.assert_called_once_with(
method="POST",
url=openml.config.server + "flow/exists",
params={},
data={
"name": flow_name,
"external_version": external_version,
"api_key": test_apikey_v1,
},
headers=openml.config._HEADERS,
files=None,
)


Copy link

Copilot AI Mar 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test asserts that FlowV1API.exists() sends api_key inside the POST body. If FlowV1API.exists() is adjusted to not require authentication (and rely on HTTPClient to include api_key only when configured), this assertion will become incorrect. Consider relaxing the assertion to not require api_key for the exists call (or explicitly set use_api_key=False in the implementation and test for that behavior).

Suggested change
mock_request.assert_called_once_with(
method="POST",
url=openml.config.server + "flow/exists",
params={},
data={
"name": flow_name,
"external_version": external_version,
"api_key": test_apikey_v1,
},
headers=openml.config._HEADERS,
files=None,
)
mock_request.assert_called_once()
_, kwargs = mock_request.call_args
assert kwargs["method"] == "POST"
assert kwargs["url"] == openml.config.server + "flow/exists"
assert kwargs["params"] == {}
assert kwargs["headers"] == openml.config._HEADERS
assert kwargs["files"] is None
assert kwargs["data"]["name"] == flow_name
assert kwargs["data"]["external_version"] == external_version

Copilot uses AI. Check for mistakes.
Comment on lines +47 to +50
ignore_cache : bool, default=False
Whether to ignore the cache. If ``true`` this will download and overwrite the flow xml
even if the requested flow is already cached.

Copy link

Copilot AI Mar 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Docstring around ignore_cache/caching behavior still refers to downloading/overwriting a flow.xml file in cache_directory/flows/{id}. Since get_flow() now delegates to HTTPClient caching (request-path-based cache entries), please update the docstring wording to match the new cache mechanism (or avoid specifying the on-disk layout).

Copilot uses AI. Check for mistakes.
flow_exists_mock.return_value = False
).encode()
mock_request.return_value = response
flow_exists_mock.return_value = None # Flow doesn't exist yet, so try to publish
Copy link

Copilot AI Mar 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

flow_exists() is documented/typed to return int | bool (with False indicating non-existence). This test sets flow_exists_mock.return_value = None, which is outside the contract and can mask type/logic issues. Please return False here to match the real API behavior and keep the test faithful to production semantics.

Suggested change
flow_exists_mock.return_value = None # Flow doesn't exist yet, so try to publish
flow_exists_mock.return_value = False # Flow doesn't exist yet, so try to publish

Copilot uses AI. Check for mistakes.
>>> openml.flows.delete_flow(23) # doctest: +SKIP
"""
return openml.utils._delete_entity("flow", flow_id)
return openml._backend.flow.delete(flow_id) # type: ignore
Copy link

Copilot AI Mar 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This # type: ignore on the return from openml._backend.flow.delete(flow_id) looks unnecessary now that FlowAPI inherits delete() from ResourceAPI. If type checking is complaining, it may be better to fix the underlying type (e.g., ensure FlowAPI exposes delete with the right signature) rather than suppressing it here.

Suggested change
return openml._backend.flow.delete(flow_id) # type: ignore
return openml._backend.flow.delete(flow_id)

Copilot uses AI. Check for mistakes.
Comment on lines 17 to 19
FLOWS_CACHE_DIR_NAME = "flows"


Copy link

Copilot AI Mar 23, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FLOWS_CACHE_DIR_NAME is now unused after removing the filesystem flow cache helpers. Please remove it (and any related unused imports) to avoid lint/static-analysis failures.

Suggested change
FLOWS_CACHE_DIR_NAME = "flows"

Copilot uses AI. Check for mistakes.
Comment on lines +297 to +304
code = server_error.get("code")
message = server_error.get("message")
additional_information = server_error.get("additional_information")
if isinstance(server_error, dict):
code = server_error.get("code")
message = server_error.get("message")
additional_information = server_error.get("additional_information")
else:
code = None
message = str(server_error)
additional_information = None
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what are these changes? I think the merge was not clean. Please sync with main

Copilot AI review requested due to automatic review settings March 24, 2026 17:10
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 7 out of 7 changed files in this pull request and generated 12 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +220 to +222
except (OpenMLServerError, KeyError):
# v2 returns 404 when flow doesn't exist, which raises OpenMLServerError
return False
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FlowV2API.exists() currently catches OpenMLServerError and returns False, which will also hide real failures (e.g., authentication/authorization problems, 500s) by reporting "flow does not exist". Only treat the specific "not found" case as non-existence (e.g., catch OpenMLServerNoResult / an OpenMLServerException with the v2 not-found code or 404) and re-raise all other exceptions.

Suggested change
except (OpenMLServerError, KeyError):
# v2 returns 404 when flow doesn't exist, which raises OpenMLServerError
return False
except OpenMLServerError as err:
# v2 returns HTTP 404 when the flow doesn't exist; only treat that case
# as "not found" and propagate all other server errors.
if getattr(err, "code", None) == 404:
return False
raise

Copilot uses AI. Check for mistakes.
Comment on lines +497 to +499
if self.flow_id is None:
raise ValueError("Flow does not have an ID. Please publish the flow before untagging.")
openml._backend.flow.untag(self.flow_id, tag)
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same as push_tag(): raising ValueError for an unpublished flow changes the public exception contract compared to OpenMLBase.remove_tag() (which raises ObjectNotPublishedError). Align the exception type/message (or delegate to the base implementation) for consistency across entities.

Copilot uses AI. Check for mistakes.
flow_exists_mock.return_value = False
).encode()
mock_request.return_value = response
flow_exists_mock.return_value = None # Flow doesn't exist yet, so try to publish
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

flow_exists_mock is typed/used as int | bool across the codebase; returning None here can mask type/logic issues (e.g., code paths that explicitly check is False). Prefer returning False to represent non-existence.

Suggested change
flow_exists_mock.return_value = None # Flow doesn't exist yet, so try to publish
flow_exists_mock.return_value = False # Flow doesn't exist yet, so try to publish

Copilot uses AI. Check for mistakes.
Comment on lines +457 to 458
openml.config.start_using_configuration_for_example()
content_file = test_files_directory / "mock_responses" / "flows" / "flow_delete_not_owned.xml"
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

start_using_configuration_for_example() emits a warning and mutates global config state; this test suite already uses the with_server autouse fixture to switch to test servers. Consider removing this call (and the other occurrences below) to avoid unnecessary warnings/global state changes, or ensure you restore the prior config via stop_using_configuration_for_example() in a finally block.

Copilot uses AI. Check for mistakes.
def test_delete_flow_with_run(mock_delete, test_files_directory, test_server_v1, test_apikey_v1):
@mock.patch.object(requests.Session, "request")
def test_delete_flow_with_run(mock_request, test_files_directory, test_apikey_v1):
openml.config.start_using_configuration_for_example()
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same issue as above: start_using_configuration_for_example() is redundant here (given the autouse server fixtures) and adds warnings/global state mutation. Prefer relying on fixtures or restoring config after the test.

Copilot uses AI. Check for mistakes.
):
flow_v2.publish(path="flow", files={"description": "<flow/>"})


Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

flow_v1.get(...) here triggers a real HTTP request (and the test output depends on live server state). Please mark this as @pytest.mark.test_server() or mock the V1 GET response so this comparison test is deterministic/offline.

Suggested change
@pytest.mark.test_server()

Copilot uses AI. Check for mistakes.
Comment on lines +485 to +488
if self.flow_id is None:
raise ValueError("Flow does not have an ID. Please publish the flow before tagging.")
openml._backend.flow.tag(self.flow_id, tag)

Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OpenMLBase.push_tag() raises ObjectNotPublishedError when the entity has no server id. This override raises ValueError instead, which is an observable API behavior change for flows. Consider preserving the exception type/message (e.g., raise openml.exceptions.ObjectNotPublishedError here, or delegate to the base implementation) so tagging an unpublished flow behaves consistently with other OpenML entities.

Copilot uses AI. Check for mistakes.
def test_delete_subflow(mock_delete, test_files_directory, test_server_v1, test_apikey_v1):
@mock.patch.object(requests.Session, "request")
def test_delete_subflow(mock_request, test_files_directory, test_apikey_v1):
openml.config.start_using_configuration_for_example()
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same issue as above: start_using_configuration_for_example() is redundant here (given the autouse server fixtures) and adds warnings/global state mutation. Prefer relying on fixtures or restoring config after the test.

Copilot uses AI. Check for mistakes.
def test_delete_flow_success(mock_delete, test_files_directory, test_server_v1, test_apikey_v1):
@mock.patch.object(requests.Session, "request")
def test_delete_flow_success(mock_request, test_files_directory, test_apikey_v1):
openml.config.start_using_configuration_for_example()
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same issue as above: start_using_configuration_for_example() is redundant here (given the autouse server fixtures) and adds warnings/global state mutation. Prefer relying on fixtures or restoring config after the test.

Copilot uses AI. Check for mistakes.
@pytest.mark.xfail(reason="failures_issue_1544", strict=False)
def test_delete_unknown_flow(mock_delete, test_files_directory, test_server_v1, test_apikey_v1):
def test_delete_unknown_flow(mock_request, test_files_directory, test_apikey_v1):
openml.config.start_using_configuration_for_example()
Copy link

Copilot AI Mar 24, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same issue as above: start_using_configuration_for_example() is redundant here (given the autouse server fixtures) and adds warnings/global state mutation. Prefer relying on fixtures or restoring config after the test.

Copilot uses AI. Check for mistakes.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[ENH] V1 → V2 API Migration - flows

7 participants