Enforce strict mypy type checking and additional ruff rules#492
Enforce strict mypy type checking and additional ruff rules#492FabianHofmann wants to merge 24 commits intomasterfrom
Conversation
…nd removing unnecessary type: ignore comments
…tions, remove stray code
…all 61 violations
…te_time helper, consistent keep_attrs, use fixtures in tests
for more information, see https://pre-commit.ci
…ypes across codebase - Add NumPy-style docstrings to all functions in datasets/era5.py and improve convert.py docstrings - Define shared Literal type aliases (TrackingType, ClearskyModel, TrigonModel, etc.) in _types.py - Update ~30 function signatures across convert.py, pv/, and data.py to use Literal types - Fix CSPConfig TypedDict missing 'technology' key and widen orientation/panel param types
Per-file ignores for datasets modules still lacking docstrings (cordex, ncep, sarah, gebco).
…d remove per-file D103 ignores
for more information, see https://pre-commit.ci
lkstrp
left a comment
There was a problem hiding this comment.
Tons of good cleanup changes, thank youuu ! Just a bit too much and mixed which makes it a bit hard to review, since this now includes changed docstrings, ruff, types and tests.
I think some things need another look. I only went through parts for it now:
- There are many comments removed, and sometimes it looks like Claude chucked them, but some might be removed intentionally? I am not always sure and I would wanna keep hand crafted code explaining comments in general (if useful and up to date). The same goes for docstrings sometimes. They are rewritten, often improved, but sometimes useful old information is removed
- In
_types.py, many types are defined which are then used everywhere. This makes sense for new literals and more complicated types, but some also just define an alias for existing types from the library. I suppose this is also a style question, but since I believe that simplicity is becoming increasingly important, I would try to simplify it. If an argument requires many different types, this is also an indication that the signature should be restructured. Abstracting all those types away just hides that
| if TYPE_CHECKING: | ||
| from types import ModuleType | ||
|
|
||
| modules: dict[str, ModuleType] = {"era5": era5, "sarah": sarah, "gebco": gebco} |
| ds = ds.sel(x=xs, y=ys) | ||
|
|
||
| if newname in {"influx", "outflux"}: | ||
| # shift averaged data to beginning of bin |
There was a problem hiding this comment.
I just don't like comments which are describing the code
| ) | ||
| ) | ||
| elif newname in {"runoff"}: | ||
| # shift and fill 6hr average data to beginning of 3hr bins |
| config.cordex_dir, # noqa: F821 | ||
| weather_data_config: dict[str, dict[str, Any]] = {} | ||
| try: | ||
| from atlite import config # type: ignore[attr-defined] |
There was a problem hiding this comment.
That should never be an import error is it?
| "prepare_func": prepare_weather_types_cordex, | ||
| "oldname": "CWT", | ||
| "newname": "CWT", | ||
| "template": os.path.join( # noqa: PTH118 |
There was a problem hiding this comment.
I would rather ignore PTH118 globally for now, so we can add it easily later again. Using pathlib instead of os is very recommended
| ) | ||
| elif clearsky_model == "enhanced": | ||
| # Enhanced Reindl model with ambient air temperature and relative | ||
| # humidity |
| influx = direct + diffuse | ||
|
|
||
| with np.errstate(divide="ignore", invalid="ignore"): | ||
| # brightening factor |
| NDArrayInt: TypeAlias = np.ndarray[Any, np.dtype[np.signedinteger[Any]]] | ||
| NDArrayBool: TypeAlias = np.ndarray[Any, np.dtype[np.bool_]] | ||
| DataArray: TypeAlias = xr.DataArray | ||
| Dataset: TypeAlias = xr.Dataset |
There was a problem hiding this comment.
Do we need all those? Some are just aliases. I think it's much cleaner to just keep for example use xr.Dataset than redefining it?
| SparseMatrix: TypeAlias = sp.lil_matrix | sp.csr_matrix | ||
|
|
||
| TrackingType: TypeAlias = ( | ||
| Literal["horizontal", "tilted_horizontal", "vertical", "dual"] | None |
There was a problem hiding this comment.
I like to define those literals! But I would not put | None in there in general. That information if something is optional should be in the signature. And again, I am not so sure if this needs to be defined in an extra Type? How often is this actually used in the signature? If I wanna see this in the function I need to click once more now. Guess for some it makes sense, but you think for all of them?
| DataFormat: TypeAlias = Literal["grib", "netcdf"] | ||
|
|
||
|
|
||
| class ERA5RetrievalParams(TypedDict, total=False): |
There was a problem hiding this comment.
Is it wise to hardcode that here? Since it's a dict it doesn't give you any benefits, is it?
Changes proposed in this Pull Request
This PR enforces strict static type checking with mypy and enables additional ruff lint rules across the entire codebase (including
test/anddoc/).Docstrings
Mypy
pyproject.toml:disallow_incomplete_defs = true,check_untyped_defs = trueatlite/,test/, anddoc/.pre-commit-config.yamlto matchatlite/_types.pytype aliases and addedTypeAliasannotations where needed (e.g.csp.py,utils.py)convert.py)__all__in__init__.pyto use strings instead of object referencesRuff
Enabled 9 new rule sets and fixed all violations (~260 fixes):
from err/from Noneto re-raised exceptions, fixed mutable defaultsifstatements, simplified dict operationselseafterreturn/raisedict()/list()constructorsTYPE_CHECKINGblocksnp.random.seed/np.random.randwithnp.random.default_rng%formattingos.path.*andopen()withpathlibequivalents# noqacommentsTODO
cast()where possible, need bump python version for mypyChecklist
doc.environment.yaml,environment_docs.yamlandsetup.py(if applicable).doc/release_notes.rstof the upcoming release is included.