Conversation
Signed-off-by: Omswastik-11 <omswastikpanda11@gmail.com>
…into issue1564
Codecov Report❌ Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #1616 +/- ##
==========================================
- Coverage 53.96% 53.11% -0.86%
==========================================
Files 61 62 +1
Lines 5051 5202 +151
==========================================
+ Hits 2726 2763 +37
- Misses 2325 2439 +114 ☔ View full report in Codecov by Sentry. 🚀 New features to boost your workflow:
|
Signed-off-by: Omswastik-11 <omswastikpanda11@gmail.com>
…-11/openml-python into runs-migration-stacked
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 42 out of 43 changed files in this pull request and generated 9 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| from __future__ import annotations | ||
|
|
||
| import hashlib | ||
| import json | ||
| import logging | ||
| import math |
| @pytest.fixture | ||
| def dummy_task_v2(http_client_v2, minio_client) -> DummyTaskV1API: | ||
| return DummyTaskV2API(http=http_client_v2, minio=minio_client) | ||
|
|
||
|
|
||
| @pytest.fixture | ||
| def dummy_task_fallback(dummy_task_v1, dummy_task_v2) -> DummyTaskV1API: | ||
| return FallbackProxy(dummy_task_v2, dummy_task_v1) | ||
|
|
| def _mocked_perform_api_call(call, request_method): | ||
| url = openml.config.server + "/" + call | ||
| url = openml.config.server + call | ||
| return openml._api_calls._download_text_file(url) |
| openml.config._set_servers(mode) | ||
| openml.config.set_api_version(api_version) | ||
| openml.config.server = "temp-server1" | ||
| openml.config.apikey = "temp-apikey1" | ||
| openml.config._get_servers(mode)["server"] = 'temp-server2' | ||
| openml.config._get_servers(mode)["apikey"] = 'temp-server2' | ||
|
|
| from __future__ import annotations | ||
|
|
||
| from enum import Enum | ||
|
|
||
|
|
||
| class APIVersion(str, Enum): |
| from urllib.parse import urljoin | ||
|
|
| from urllib.parse import urlparse | ||
|
|
| f"allowed versions: {', '.join(list(APIVersion))}" | ||
| ) | ||
|
|
||
| if fallback_api_version is not None and fallback_api_version not in APIVersion: | ||
| raise ValueError( | ||
| f'invalid fallback_api_version="{fallback_api_version}" ' | ||
| f"allowed versions: {', '.join(list(APIVersion))}" |
| attr = getattr(api, name, None) | ||
| if attr is not None: | ||
| return api, attr |
There was a problem hiding this comment.
Please take a look at the CI failures, they look genuine. I think you need to update the tests to use the expected content from local database.
I think this is the reason: #1611 (review)
…into runs-migration-stacked
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 44 out of 47 changed files in this pull request and generated 6 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| from urllib.parse import urlparse | ||
|
|
There was a problem hiding this comment.
urlparse is imported but never used in this test module. This will fail linting in setups that enforce unused imports; please remove it or use it.
| def test_get_production_servers(): | ||
| assert openml.config.get_production_servers() == openml.config._get_servers("production") | ||
|
|
||
|
|
||
| def test_get_test_servers(): | ||
| assert openml.config.get_test_servers() == openml.config._get_servers("test") | ||
|
|
||
|
|
||
| def test_use_production_servers(): | ||
| openml.config.use_production_servers() | ||
| servers_1 = openml.config.servers | ||
|
|
||
| openml.config._set_servers("production") | ||
| servers_2 = openml.config.servers | ||
|
|
||
| assert servers_1 == servers_2 | ||
|
|
||
|
|
||
| def test_use_test_servers(): | ||
| openml.config.use_test_servers() | ||
| servers_1 = openml.config.servers | ||
|
|
||
| openml.config._set_servers("test") | ||
| servers_2 = openml.config.servers |
There was a problem hiding this comment.
These tests pass string literals ("production"/"test") into _get_servers/_set_servers, but those methods are typed/implemented to take ServerMode and will currently raise (or depend on implicit coercion). Use ServerMode.PRODUCTION/ServerMode.TEST here (or update the implementation to accept strings consistently).
| V2 server does not yet support POST /runs/ endpoint. | ||
| Expected availability: Q2 2025 |
There was a problem hiding this comment.
The docstring says the V2 run publish endpoint is expected in “Q2 2025”, which is now in the past. Please update/remove the date to avoid misleading users (e.g., point to an issue/roadmap link instead of a time estimate).
| V2 server does not yet support POST /runs/ endpoint. | |
| Expected availability: Q2 2025 | |
| V2 server does not yet support the POST /runs/ endpoint. | |
| See the OpenML roadmap or issue tracker for the current status. |
| def _get_servers(mode: ServerMode) -> dict[APIVersion, dict[str, str | None]]: | ||
| if mode not in ServerMode: | ||
| raise ValueError(f'invalid mode="{mode}" allowed modes: {", ".join(list(ServerMode))}') | ||
| return deepcopy(_SERVERS_REGISTRY[mode]) |
There was a problem hiding this comment.
_get_servers only accepts ServerMode enum values, but it’s being used with strings (e.g., "production"/"test") in the tests and will raise ValueError because mode not in ServerMode for a plain string. Consider accepting str | ServerMode here (e.g., coerce via ServerMode(mode) when isinstance(mode, str)), or update all call sites to pass ServerMode.PRODUCTION/ServerMode.TEST consistently.
| def _get_servers(mode: ServerMode) -> dict[APIVersion, dict[str, str | None]]: | |
| if mode not in ServerMode: | |
| raise ValueError(f'invalid mode="{mode}" allowed modes: {", ".join(list(ServerMode))}') | |
| return deepcopy(_SERVERS_REGISTRY[mode]) | |
| def _get_servers(mode: ServerMode | str) -> dict[APIVersion, dict[str, str | None]]: | |
| if isinstance(mode, str): | |
| try: | |
| mode = ServerMode(mode) | |
| except ValueError as exc: | |
| allowed = ", ".join(m.value for m in ServerMode) | |
| raise ValueError(f'invalid mode="{mode}" allowed modes: {allowed}') from exc | |
| elif mode not in ServerMode: | |
| allowed = ", ".join(m.value for m in ServerMode) | |
| raise ValueError(f'invalid mode="{mode}" allowed modes: {allowed}') | |
| return deepcopy(_SERVERS_REGISTRY[cast(ServerMode, mode)]) |
| return self.servers[self.api_version]["apikey"] | ||
|
|
||
| @apikey.setter | ||
| def apikey(self, value: str | None) -> None: |
There was a problem hiding this comment.
The apikey setter no longer enforces the previous type restriction (string or None). Since this is part of the public configuration surface, it’s easy for callers to accidentally set non-strings (e.g., dict/int) and only fail later during request building. Consider restoring a simple type check in the setter to fail fast with a clear TypeError.
| def apikey(self, value: str | None) -> None: | |
| def apikey(self, value: str | None) -> None: | |
| if value is not None and not isinstance(value, str): | |
| raise TypeError( | |
| f"apikey must be a string or None, got {type(value).__name__}: {value!r}" | |
| ) |
| openml.config.server = "temp-server1" | ||
| openml.config.apikey = "temp-apikey1" | ||
| openml.config._get_servers(mode)["server"] = 'temp-server2' | ||
| openml.config._get_servers(mode)["apikey"] = 'temp-server2' |
There was a problem hiding this comment.
This test is mutating the dict returned by _get_servers(mode) using keys "server"/"apikey", but _get_servers returns a mapping keyed by APIVersion, so these assignments will raise KeyError. If the intent is to verify the returned value is a deep copy, mutate a nested entry (e.g., servers_copy[api_version]["server"] = ...) and then assert a fresh _get_servers(mode) call still equals orig_servers.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 10 out of 12 changed files in this pull request and generated 5 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| ------ | ||
| OpenMLNotSupportedError | ||
| V2 server does not yet support POST /runs/ endpoint. | ||
| Expected availability: Q2 2025 |
There was a problem hiding this comment.
The docstring claims an "Expected availability: Q2 2025" for the V2 publish endpoint, which is now in the past and likely to become increasingly misleading. Consider removing the date or replacing it with a timeless note (e.g., link to an issue/roadmap entry) so the message stays accurate over time.
| Expected availability: Q2 2025 |
| import time | ||
| import unittest | ||
| import warnings | ||
| from urllib.parse import urljoin |
There was a problem hiding this comment.
urljoin is imported but not used anywhere in this test module. Please remove the import (or use it) to avoid lint/test failures in environments enforcing unused-import checks.
| from urllib.parse import urljoin |
| @mock.patch.object(requests.Session, "request") | ||
| def test_delete_run_not_owned(mock_delete, test_files_directory, test_server_v1, test_apikey_v1): | ||
| content_file = test_files_directory / "mock_responses" / "runs" / "run_delete_not_owned.xml" | ||
| mock_delete.return_value = create_request_response( |
There was a problem hiding this comment.
The patched method is requests.Session.request, but the mock parameter is still named mock_delete, which is misleading. Rename the argument to something like mock_request (and update local variable names) to reflect what’s being mocked.
| ignore_cache : bool | ||
| Whether to ignore the cache. If ``true`` this will download and overwrite the run xml | ||
| even if the requested run is already cached. | ||
|
|
There was a problem hiding this comment.
The get_run docstring repeats the ignore_cache parameter section twice. Please remove the duplicate block so the docstring stays accurate and easier to maintain.
| ignore_cache : bool | |
| Whether to ignore the cache. If ``true`` this will download and overwrite the run xml | |
| even if the requested run is already cached. |
| f'"http://openml.org/openml": {runs_dict}', | ||
| ) | ||
|
|
||
| assert isinstance(runs_dict["oml:runs"]["oml:run"], list), type(runs_dict["oml:runs"]) |
There was a problem hiding this comment.
assert isinstance(..., list) is used for validating the parsed XML shape. Asserts can be disabled with Python optimizations (-O), which would skip this check and potentially lead to harder-to-debug errors later. Prefer raising a TypeError/ValueError (similar to openml.runs.functions.__list_runs) instead of using assert for input validation.
| assert isinstance(runs_dict["oml:runs"]["oml:run"], list), type(runs_dict["oml:runs"]) | |
| if not isinstance(runs_dict["oml:runs"]["oml:run"], list): | |
| raise TypeError( | |
| "Error in return XML, 'oml:runs/oml:run' is expected to be a list, " | |
| f"but got {type(runs_dict['oml:runs']['oml:run']).__name__}: " | |
| f"{runs_dict['oml:runs']['oml:run']}" | |
| ) |
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 10 out of 12 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
| assert isinstance(runs_dict["oml:runs"]["oml:run"], list), type(runs_dict["oml:runs"]) | ||
|
|
There was a problem hiding this comment.
_parse_list_xml uses a bare assert to validate that oml:run is a list. Assertions can be stripped with python -O, and even when enabled they raise AssertionError (less actionable for API consumers). Prefer an explicit isinstance(..., list) check that raises TypeError/ValueError with a clear message (similar to openml.runs.functions.__list_runs).
| try: | ||
| run_v1.get(run_id=TEST_RUN_ID) | ||
| return TEST_RUN_ID | ||
| except Exception: | ||
| runs_df = run_v1.list(limit=1, offset=0) | ||
| if runs_df.empty: | ||
| pytest.skip("No runs available on configured test server") | ||
| return int(runs_df.iloc[0]["run_id"]) |
There was a problem hiding this comment.
_get_any_run_id catches a broad Exception, which can hide real test failures (e.g., network issues, parsing errors) and make debugging harder. It would be more robust to only fall back to list() for the expected failure modes (e.g., an OpenML server “not found”/HTTP error for that specific run_id) and let unexpected exceptions fail the test.
geetu040
left a comment
There was a problem hiding this comment.
Please see #1619 (review) regarding changed cache path
Metadata
Details
fixes #1624