Dpbm/video-python-import-methods

Notes

C includes

headers (.h files)

  • contain the interfaces (declarations) for external code. Including a header in a C source file is equivalent to copying its declarations into that file, but is less error-prone.

#include

  • the preprocessor reads the named file and pastes its contents into the current source file.
  • to avoid errors, use it only to bring in declarations and definitions from header files.
  • #include <file> searches only the standard system directories. #include "file" searches the directory of the current file first and then falls back to the system locations.

Wrappers

  • to avoid including the same header twice, wrap it in an include guard:
#ifndef DEFINITION_OF_HEADER
#define DEFINITION_OF_HEADER

...

#endif 
  • you can also use #pragma once, which is non-standard but widely supported.

JS

Common JS

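  • uses require and exports.
  • require can load ES modules (for example .mjs files) only in recent Node.js versions, and only when the module has no top-level await. Older versions need import() instead.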

ES modules

  • are more flexible
  • you can define queries, properties, etc.

Python import

  • Python uses importlib as the underlying machinery for importing.
  • the components of the import system are exposed through importlib, so you can build a custom importer.
  • the import statement calls __import__().
  • an importer is an object that both finds and loads a module.
  • the import statement doesn't put every definition of the module into the current scope, only the module name. To access its contents, use module_name.<what you want>.
  • it is not required to be at the top of the file.
  • the import statement searches for the module, creates the module object and binds a name, while __import__() only searches and creates the object.
  • when the module is not found, it raises ModuleNotFoundError.
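
The difference between the statement and the function forms can be seen in a short sketch. It only uses the standard library; the missing module name is hypothetical and chosen so that the lookup fails.

```python
import importlib
import sys

# `import os.path` binds the top-level name `os`. __import__ mirrors that and
# returns the top-level package, while import_module returns the submodule.
top_level = __import__("os.path")
submodule = importlib.import_module("os.path")

print(top_level.__name__)                    # os
print(submodule is sys.modules["os.path"])   # True

# A missing module raises ModuleNotFoundError, a subclass of ImportError.
try:
    importlib.import_module("surely_missing_module_for_demo")  # hypothetical name
except ModuleNotFoundError as exc:
    missing_name = exc.name

print(missing_name)                          # surely_missing_module_for_demo
```

In application code, importlib.import_module() is usually the more convenient choice when the module name is only known at runtime.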

The import statement

  1. finds a module and loads it.
  2. defines a name for this module in the current scope.

The import statement using from

  1. finds the module specified in the from clause.
  2. for each name in the import clause, checks whether the module has an attribute with that name. If not, it tries to import a submodule with that name. If that also fails, it raises ImportError. Otherwise it binds the name in the local namespace.

using *

  • it imports all public names of the module into the current scope.
  • names starting with _ are not imported by default.
  • if __all__ is defined, only the names listed in it are imported. Otherwise every name that does not start with _ is imported.
  • __all__ is a module-level variable, defined per module.
  • not recommended, because it hides where each name comes from.
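
A minimal sketch of how __all__ changes what a star import brings in. The module "star_demo" and its names are hypothetical; it is built in memory and registered in sys.modules only for the duration of the example.

```python
import sys
import types

# Build a throwaway module in memory; "star_demo" is a hypothetical name.
demo = types.ModuleType("star_demo")
exec(
    "__all__ = ['listed', '_listed_private']\n"
    "listed = 1\n"
    "_listed_private = 2\n"
    "unlisted = 3\n",
    demo.__dict__,
)
sys.modules["star_demo"] = demo


def star_import_names():
    """Run `from star_demo import *` in a fresh namespace and list what arrived."""
    namespace = {}
    exec("from star_demo import *", namespace)
    return sorted(name for name in namespace if name != "__builtins__")


# With __all__: exactly the listed names, even the one starting with "_".
with_all = star_import_names()

# Without __all__: every name that does not start with "_".
del demo.__all__
without_all = star_import_names()

sys.modules.pop("star_demo")

print(with_all)      # ['_listed_private', 'listed']
print(without_all)   # ['listed', 'unlisted']
```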

Relative imports

  • only work inside packages, between modules of the same package hierarchy.
  • . refers to the current package and .. to the one a level up.
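
How the dots are resolved can be checked with importlib.util.resolve_name(), which turns a relative name into an absolute one. The package "shop.cart" is hypothetical and nothing is actually imported here.

```python
from importlib.util import resolve_name

# "shop.cart" is a hypothetical package used only for illustration.
current = resolve_name(".items", package="shop.cart")      # one dot: current package
parent = resolve_name("..payments", package="shop.cart")   # two dots: one level up

print(current)   # shop.cart.items
print(parent)    # shop.payments
```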

PYTHONPATH

  • adds entries to sys.path.
  • useful when we need to test a package without installing it.
  • affects every Python version/environment started while it is set, so only use it when it's really needed.
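
A small sketch of the effect, assuming a fresh interpreter can be started with sys.executable. The variable is set only for the child process, which avoids touching the environment of the current session.

```python
import os
import subprocess
import sys
import tempfile

with tempfile.TemporaryDirectory() as extra_dir:
    # Copy the environment and add PYTHONPATH for the child process only.
    env = dict(os.environ, PYTHONPATH=extra_dir)
    # PYTHONPATH is read at interpreter startup, so a new process is needed.
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print('\\n'.join(sys.path))"],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    child_path = result.stdout.splitlines()
    found = extra_dir in child_path

print(found)
```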

sys.path

  • first entry is the directory of the current script or the current directory when using -c or -m.

sys.modules

  • a dictionary that maps the names of already loaded modules to their module objects.
  • it's a cache.
  • a hit returns the module object ready to use, without running it again.
  • maps submodules as well.
  • if an entry's value is None, Python raises a ModuleNotFoundError.
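
The cache behaviour described above can be observed directly. The blocked module name is hypothetical; the entry is removed again at the end.

```python
import importlib
import json
import sys

# A second import is only a dictionary lookup: the same object comes back.
same_object = importlib.import_module("json") is sys.modules["json"]

# Submodules get their own entries, keyed by their dotted name.
importlib.import_module("json.decoder")
has_submodule = "json.decoder" in sys.modules

# A None entry blocks the import; "blocked_demo_module" is a hypothetical name.
sys.modules["blocked_demo_module"] = None
try:
    import blocked_demo_module
except ModuleNotFoundError:
    blocked = True
else:
    blocked = False
finally:
    del sys.modules["blocked_demo_module"]

print(same_object, has_submodule, blocked)   # True True True
```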

sys.meta_path (meta hooks)

  • are called at the beginning of the import process (after the sys.modules lookup), so they can override sys.path processing, frozen modules and even built-in ones.
  • to register one, add a finder object to sys.meta_path.
  • each finder returns a spec if it can handle the module and None otherwise. If the whole list is checked and every finder returns None, the module cannot be handled and a ModuleNotFoundError is raised.
  • it's called multiple times for multilevel packages (foo.bar.baz needs 3 calls: 'foo', 'foo.bar', 'foo.bar.baz').
  • Python has 3 finders by default: one for built-in modules, one for frozen modules and one for the import path.
  • by default the import path finder can handle .py, .pyc and extension modules (such as .so files).
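
As an illustration of a meta hook, here is a minimal sketch of a finder that serves a module from a dictionary instead of the file system. The class name, the SOURCES dictionary and the module "virtual_greeting" are all hypothetical; only the importlib calls are real.

```python
import importlib.abc
import importlib.util
import sys

# Hypothetical in-memory "files": module name -> source code.
SOURCES = {"virtual_greeting": "def hello():\n    return 'hi from memory'\n"}


class InMemoryImporter(importlib.abc.MetaPathFinder, importlib.abc.Loader):
    """Finder and loader in one object (an importer)."""

    def find_spec(self, fullname, path=None, target=None):
        if fullname not in SOURCES:
            return None  # let the next finder in sys.meta_path try
        return importlib.util.spec_from_loader(fullname, self)

    def create_module(self, spec):
        return None  # fall back to the default module creation

    def exec_module(self, module):
        # Run the stored source inside the new module's namespace.
        exec(SOURCES[module.__name__], module.__dict__)


importer = InMemoryImporter()
sys.meta_path.insert(0, importer)  # consulted before the default finders
try:
    import virtual_greeting
    message = virtual_greeting.hello()
finally:
    # Undo the registration so the example leaves no trace.
    sys.meta_path.remove(importer)
    sys.modules.pop("virtual_greeting", None)

print(message)   # hi from memory
```

Placing the finder at index 0 is a choice made for the demo. Appending it instead would let the default finders win whenever they can handle the name.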

sys.path_hooks

  • part of sys.path processing.
  • a hook is registered by adding an importer factory (a callable) to sys.path_hooks.
  • each hook checks if a path item can be handled.
  • it returns a path entry finder when the item can be handled, otherwise it raises ImportError.
  • it is also consulted while traversing a package's __path__.
  • Python uses this list to handle different types of files and locations, like URLs, zip files, .so files, .py files, etc.
  • it's used by the import path finder, the last default entry of sys.meta_path. When the earlier finders don't handle the module, it goes through each entry of sys.path and offers it to each sys.path_hooks entry until one accepts it.
  • call sys.path_importer_cache.clear() after adding a new factory, so entries cached earlier are checked again.
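
A minimal sketch of a path hook, under the assumption that a made-up path entry scheme ("demo-memory://") is added to sys.path. The scheme, the classes and the module "hooked_module" are hypothetical and exist only for this example.

```python
import importlib.abc
import importlib.util
import sys

PREFIX = "demo-memory://"                      # hypothetical path entry
SOURCES = {"hooked_module": "ANSWER = 42\n"}   # hypothetical module source


class MemoryLoader(importlib.abc.Loader):
    def exec_module(self, module):
        exec(SOURCES[module.__name__], module.__dict__)


class MemoryPathFinder(importlib.abc.PathEntryFinder):
    def find_spec(self, fullname, target=None):
        if fullname not in SOURCES:
            return None
        return importlib.util.spec_from_loader(
            fullname, MemoryLoader(), origin=PREFIX + fullname
        )


def memory_hook(path_entry):
    """Path hook: return a finder, or raise ImportError to decline the entry."""
    if not (isinstance(path_entry, str) and path_entry.startswith(PREFIX)):
        raise ImportError("not a demo-memory entry")
    return MemoryPathFinder()


sys.path_hooks.insert(0, memory_hook)
sys.path.append(PREFIX)
sys.path_importer_cache.clear()   # forget finders cached before the hook existed
try:
    import hooked_module
    answer = hooked_module.ANSWER
finally:
    # Undo every change made to the import state.
    sys.path.remove(PREFIX)
    sys.path_hooks.remove(memory_hook)
    sys.path_importer_cache.clear()
    sys.modules.pop("hooked_module", None)

print(answer)   # 42
```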

Modules

  • a module is a file containing statements and definitions.
  • the module name is the filename without the .py suffix, and it can be accessed via __name__.
  • can have executable statements, which are executed only once, the first time the module is imported.
  • has its own private namespace.
  • when executed as a script, its name (__name__) becomes __main__.
  • when importing a module, Python searches in this sequence:
    1. sys.builtin_module_names.
    2. sys.path (which includes the directory of the script, PYTHONPATH and the installation-dependent default, including site-packages).
  • the dir() function lists all the names defined in a module.
  • builtins is the module that holds everything built into Python.
  • every module has a repr (representation) that depends on data like its name, origin, etc.
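
That a module body runs only once can be checked with a throwaway module that logs each execution. The module "once_demo" and the log file are hypothetical and live in a temporary directory.

```python
import importlib
import pathlib
import sys
import tempfile

with tempfile.TemporaryDirectory() as folder:
    # "once_demo" is a hypothetical module; its body appends a line per execution.
    log_file = pathlib.Path(folder, "runs.log")
    pathlib.Path(folder, "once_demo.py").write_text(
        f"with open({str(log_file)!r}, 'a') as log:\n"
        "    log.write(__name__ + '\\n')\n"
    )
    sys.path.insert(0, folder)
    try:
        first = importlib.import_module("once_demo")
        second = importlib.import_module("once_demo")   # served from sys.modules
        runs = log_file.read_text().splitlines()
        module_names = dir(first)
    finally:
        sys.path.remove(folder)
        sys.modules.pop("once_demo", None)

print(first is second)   # True
print(runs)              # ['once_demo']
```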

__pycache__

  • contains the compiled bytecode (.pyc files) of modules that were previously imported.
  • Python does not check the cache when:
    1. the module is loaded directly from the command line (it is always recompiled and the result is not stored).
    2. there is no source module (only the compiled file is distributed).
  • to reduce the compiled size use:
    1. -O to remove assert statements
    2. -OO to remove asserts and __doc__ strings
    3. files compiled this way carry an opt- tag in their .pyc name
  • .pyc files only load faster, they don't run faster.
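
The naming scheme of the cached files can be inspected with importlib.util.cache_from_source(). The source path "pkg/mod.py" is hypothetical and nothing is read from or written to disk.

```python
import importlib.util

# "pkg/mod.py" is a hypothetical source path used only to compute names.
plain = importlib.util.cache_from_source("pkg/mod.py", optimization="")
level_1 = importlib.util.cache_from_source("pkg/mod.py", optimization=1)  # as with -O
level_2 = importlib.util.cache_from_source("pkg/mod.py", optimization=2)  # as with -OO

# The exact interpreter tag (for example "cpython-312") depends on the
# Python build that runs this code.
print(plain)
print(level_1)
print(level_2)
```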

Packages

  • a collection of modules.
  • when importing inside the package, Python searches sys.path for the names. For the subpackages of a regular package to be found, each subpackage directory must have an __init__.py file.
  • when using from package import item, Python first looks for item as a name defined in the package. If it's not found, it tries to load it as a submodule. If that also fails, it raises an ImportError.
  • import * only imports the names defined by the package's __init__.py (or listed in its __all__), not every submodule name.
  • you can use relative imports within the submodules to navigate quickly between code.
  • you can also use __path__ to check which directories Python searches for the submodules.
  • any module that has a __path__ is a package.

Regular Packages

  • directories with __init__.py.

Namespace Packages

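  • split a package between multiple locations on disk.
  • cannot contain __init__.py files.
  • __path__ is a read-only iterable that is updated automatically when the parent path changes.
  • at runtime the portions are discovered through sys.path (or the parent package's __path__).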
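
The two kinds of packages can be compared side by side with a sketch that builds both in a temporary directory. The package names are hypothetical and are removed from sys.modules afterwards.

```python
import importlib
import pathlib
import sys
import tempfile

with tempfile.TemporaryDirectory() as root:
    # Regular package: a directory with __init__.py (hypothetical name).
    regular_dir = pathlib.Path(root, "regular_demo_pkg")
    regular_dir.mkdir()
    (regular_dir / "__init__.py").write_text("INITIALIZED = True\n")

    # Namespace package: a directory without __init__.py (hypothetical name).
    namespace_dir = pathlib.Path(root, "namespace_demo_pkg")
    namespace_dir.mkdir()
    (namespace_dir / "part.py").write_text("VALUE = 1\n")

    sys.path.insert(0, root)
    try:
        regular = importlib.import_module("regular_demo_pkg")
        namespace = importlib.import_module("namespace_demo_pkg")
        part = importlib.import_module("namespace_demo_pkg.part")

        regular_file = pathlib.Path(regular.__file__).name       # "__init__.py"
        namespace_file = getattr(namespace, "__file__", None)    # None, no __init__.py
        both_have_path = hasattr(regular, "__path__") and hasattr(namespace, "__path__")
        namespace_locations = [pathlib.Path(p).name for p in namespace.__path__]
        part_value = part.VALUE
    finally:
        sys.path.remove(root)
        for name in ("regular_demo_pkg", "namespace_demo_pkg", "namespace_demo_pkg.part"):
            sys.modules.pop(name, None)

print(regular_file, namespace_file, both_have_path, namespace_locations)
```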

__init__.py

  • can be empty but can also run initialization code.
  • when a package (or one of its submodules) is imported, its __init__.py is executed.
  • marks a directory as a regular package.

Import Protocol

  • it has two parts: the finder and the loader.
  • an object that implements both is called an importer.
  • finder and loader can be the same object.

finders

  • it determines if a module can be found.
  • returns a module spec with all the information needed for loading it.
  • can point to any location, which is not required to be on the local machine.
  • can be a meta path finder, which operates at the beginning of the import process, or a path entry finder, which is responsible for finding modules and packages located via a string path entry.
  • sys.path_importer_cache maintains a cache of path entry finder objects.
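
What a finder hands back can be inspected with importlib.util.find_spec(), which runs the finders without executing the module. The standard library package json is used as the target, and the missing name is hypothetical.

```python
import importlib.util

spec = importlib.util.find_spec("json")

print(spec.name)                         # json
print(spec.origin)                       # location of json/__init__.py on this installation
print(spec.loader)                       # the loader that will execute the module
print(spec.submodule_search_locations)   # not None, because json is a package

# When no finder can handle the name, the result is None instead of an error.
missing = importlib.util.find_spec("surely_missing_module_for_demo")  # hypothetical name
print(missing)                           # None
```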

loaders

  • don't need to check sys.modules, the import machinery checks it beforehand.
  • are responsible for executing the module.

Overall Execution pipeline (Default pipeline as I understood)

  • you use import
  • under the hood it calls __import__()
  • searches for the module in the sys.modules cache
  • if not found, runs the meta hooks
  • runs the finders one by one
    • if no earlier meta_path finder worked, it iterates over sys.path and sys.path_hooks to see if any of the paths can be handled
  • checks if the .pyc file is up to date, when importing Python source files
    • if not, it recompiles the file and saves the new .pyc (with the source's timestamp or hash)
  • runs the loader, returning a module object
  • binds the module object to the name provided via import
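
The pipeline above can be approximated in a few lines with importlib. This is a simplified sketch for top-level modules only, modelled on the approximation of import_module() in the importlib documentation; the function name simple_import is hypothetical, and the real machinery also handles packages, locking and more error cases.

```python
import importlib.util
import sys


def simple_import(name):
    """Rough approximation of `import name` for a top-level module."""
    # 1. Look in the sys.modules cache first.
    if name in sys.modules:
        return sys.modules[name]

    # 2. Ask each finder on sys.meta_path for a spec.
    for finder in sys.meta_path:
        spec = finder.find_spec(name, None)
        if spec is not None:
            break
    else:
        raise ModuleNotFoundError(f"No module named {name!r}", name=name)

    # 3. Create the module object and register it before executing it.
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module

    # 4. Let the loader run the module's code (this is where .pyc files are used).
    try:
        spec.loader.exec_module(module)
    except BaseException:
        sys.modules.pop(name, None)   # do not leave a half-initialized module behind
        raise
    return module


# 5. Bind the module object to a name, as the import statement would.
colorsys = simple_import("colorsys")
print(colorsys.rgb_to_hsv(1.0, 0.0, 0.0))   # (0.0, 1.0, 1.0)
```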

References