Member

FFY00 commented Oct 10, 2025

Member Author

FFY00 commented Oct 10, 2025

Still needs tests, but I'll wait to see the feedback on the issue.

…r.discover

Signed-off-by: Filipe Laíns <lains@riseup.net>
Member Author

FFY00 commented Dec 10, 2025

I've updated the method to take a module spec, instead of a module object, as in some cases the parent might have failed to import.

In a follow-up, I will add a protocol for finders implementing .discover(), as @brettcannon suggested.

Signed-off-by: Filipe Laíns <lains@riseup.net>
Signed-off-by: Filipe Laíns <lains@riseup.net>
Signed-off-by: Filipe Laíns <lains@riseup.net>
This reverts commit 31d1a8f.

Signed-off-by: Filipe Laíns <lains@riseup.net>
Signed-off-by: Filipe Laíns <lains@riseup.net>
if parent is None:
    path = sys.path
else:
    path = parent.submodule_search_locations
Member

I think this can be None when parent is a non-package module? should we make a nicer error message, or should this situation use sys.path?

Member Author

How can it be a non-package module if it has a child? It could be a namespace package, but that's still a package, and parent.submodule_search_locations should be an iterable object when the spec is fully initialized, which it should always be at this point.

Unless I am missing something? Do we support package-like module extensions?
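
To make the question concrete, here is a small sketch of how `submodule_search_locations` differs between a package spec and a plain module spec, using the standard-library `json` package purely as an illustration. The helper `search_path_for` is hypothetical and not part of this PR; it only shows one way a clearer error could be raised if a non-package spec were ever passed in.

```python
import importlib.util
import sys


def search_path_for(parent):
    """Hypothetical helper: choose the search path for a parent spec."""
    if parent is None:
        # No parent means a top-level search.
        return list(sys.path)
    locations = parent.submodule_search_locations
    if locations is None:
        # Specs for plain (non-package) modules carry None here.
        raise ValueError(f"{parent.name!r} is not a package")
    return list(locations)


package_spec = importlib.util.find_spec("json")
module_spec = importlib.util.find_spec("json.decoder")

# The package spec has search locations; the plain module spec does not.
print(package_spec.submodule_search_locations is not None)
print(module_spec.submodule_search_locations is None)
```

Whether raising, returning nothing, or falling back to `sys.path` is the right behavior is exactly the open question in this thread; the sketch only picks one option for illustration.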

        return path_hook_for_FileFinder

    def _find_children(self):
        for entry in _os.scandir(self.path):
Member

use a with statement to explicitly close the iterator - https://docs.python.org/3/library/os.html#os.scandir.close

also consider handling exceptions similar to what _fill_cache does, which is also needed around the is_dir and is_file bits within the loop below - https://docs.python.org/3/library/os.html#os.scandir.close

document exception handling semantics of discover. I doubt callers ever expect to need to be prepared for those?
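
A minimal sketch of what this suggestion could look like, assuming unreadable directories should simply yield nothing (the set of exceptions caught on `scandir` mirrors, as far as I understand it, what `_fill_cache` handles). The function name `iter_child_names` is hypothetical and is not the PR's `_find_children`.

```python
import os
import tempfile


def iter_child_names(path):
    """Yield (name, is_dir) pairs for *path*, tolerating filesystem errors."""
    try:
        scanner = os.scandir(path)
    except (FileNotFoundError, PermissionError, NotADirectoryError):
        # Assumption: treat an unreadable directory as having no children.
        return
    # The with statement closes the iterator when the loop ends
    # or when the generator itself is closed.
    with scanner:
        for entry in scanner:
            try:
                is_dir = entry.is_dir()
            except OSError:
                # is_dir() may need a system call that can fail; skip the entry.
                continue
            yield entry.name, is_dir


with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, "pkg"))
    open(os.path.join(tmp, "mod.py"), "w").close()
    print(sorted(iter_child_names(tmp)))
```

Swallowing errors keeps callers simple, but it also hides problems; whichever choice is made is worth stating in the documentation of `discover`.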

        return path_hook_for_FileFinder

    def _find_children(self):
        for entry in _os.scandir(self.path):
Member

should this be cached or not? we should document the cache interaction behavior either way.

Member Author

I don't think so? Because it could change at runtime, no?

# files
if entry.is_file():
    yield from [
        entry.name.removesuffix(suffix)
Member

entry.name could exist with multiple loader suffixes on the filesystem. dedupe to avoid redundant specs?
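
As a rough illustration of the deduplication being asked for, the sketch below strips loader suffixes from file names and yields each resulting module name once. The function `candidate_names` is hypothetical, and the suffix tuple in the demo is a simplified stand-in; in real code the suffixes would more likely come from `importlib.machinery.all_suffixes()`.

```python
import importlib.machinery


def candidate_names(filenames, suffixes):
    """Yield each module name once, even if several suffixed files exist."""
    seen = set()
    # Try longer suffixes first so the most specific suffix wins
    # (for example ".pyc" before ".py").
    ordered = sorted(suffixes, key=len, reverse=True)
    for filename in filenames:
        for suffix in ordered:
            if filename.endswith(suffix):
                name = filename.removesuffix(suffix)
                if name and name not in seen:
                    seen.add(name)
                    yield name
                break


demo_files = ["foo.py", "foo.pyc", "bar.py", "README"]
print(list(candidate_names(demo_files, (".py", ".pyc"))))

# The interpreter's full suffix list could be used instead:
real_suffixes = importlib.machinery.all_suffixes()
```

Deduplicating by name means only one spec would be produced per module, which leaves open which loader should win; that choice is not addressed here.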

module spec. If *parent* is *None*, :meth:`MetaPathFinder.discover` will
search for top-level modules.

Returns an iterable of possible specs.
Member

would this be the first time we have a public API yielding things from importlib? do we actually want to do that, or should this return a list?

callers consuming the results might be writing code that makes changes that could impact future results... yielding could get messy.

what are the intended use cases for the API? if it's something we expect callers to short circuit and stop iterating on after the first match maybe yield makes sense, but then we should probably just have an explicit direct discover_first API for that instead.

Member Author

FFY00 Dec 17, 2025

This is something I considered.

The main use case is finding a similarly named module to show as a hint on ModuleNotFoundError (e.g. "Did you mean numpy?" when trying to import numby), so I think it makes sense to make this new API a generator, or at least some kind of lazy container.

For cases such as the one you describe, the user could just consume the full generator into a list to avoid any issue. That still leaves room for code that can benefit from a lazy API: scanning directories with a lot of files can take a while, not to mention the more exotic finders out there that may operate over the network or something like that.

I am not fundamentally opposed to making this method return a list, but I can't see the value in the trade-off if we document it properly.

> callers consuming the results might be writing code that makes changes that could impact future results... yielding could get messy.

While this is technically possible, I would expect it to be extremely uncommon. And I think it is reasonable to assume that people knowledgeable enough to do that would be aware of the downsides of making changes to the import machinery while consuming the API.

> but then we should probably just have an explicit direct discover_first API for that instead

And what would that look like? Would it take a predicate function and return the first entry that matches?


So, would having a warning in the documentation regarding your concern be a good enough compromise?
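
The two consumption patterns discussed in this thread can be sketched as follows. Everything here is illustrative: `discover_names` is a hypothetical stand-in that yields plain names (the proposed API yields specs), and `discover_first` is just one possible shape for the predicate-based helper asked about above.

```python
import difflib


def discover_names():
    """Hypothetical stand-in for a lazy discovery API; yields module names."""
    yield from ("os", "sys", "numpy", "json")


def suggest(missing_name):
    """Return the closest known name, or None if nothing is similar enough."""
    # Snapshot the lazy iterable so later import-system changes cannot
    # affect the results while we work with them.
    names = list(discover_names())
    matches = difflib.get_close_matches(missing_name, names, n=1)
    return matches[0] if matches else None


def discover_first(predicate):
    """Return the first name matching *predicate*, stopping iteration early."""
    return next((name for name in discover_names() if predicate(name)), None)


print(suggest("numby"))
print(discover_first(lambda name: name.startswith("s")))
```

The first helper shows that a caller worried about mutation can opt into a list at negligible cost, while the second shows that short-circuiting falls out of a lazy iterable without a dedicated API.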

else:
    path = parent.submodule_search_locations

for entry in path:
Member

path can contain duplicate entries. should we dedupe?

Member Author

Yeah, sure 👍
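
One simple way to drop duplicate path entries while keeping their original order is sketched below; the helper name `unique_entries` is hypothetical and only illustrates the idea agreed on above.

```python
def unique_entries(path):
    """Return path entries with duplicates removed, keeping first-seen order."""
    # dict preserves insertion order, so the first occurrence of each entry wins.
    return list(dict.fromkeys(path))


print(unique_entries(["/a", "/b", "/a", "/c", "/b"]))
```

Keeping the first occurrence matters because earlier path entries take precedence during import.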
