Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid network/index related imports for commands that won't need 'em #12566

Merged
merged 9 commits into from
Apr 9, 2024

Conversation

ichard26
Copy link
Member

Towards #4768.

These imports are slow and should be avoided as much as possible. For example, this patch brought down the execution time of pip cache dir from 200ms to 160ms locally.

pip list is a bit special as it's currently an IndexGroupCommand and performs a pip version self-check. Speeding pip list can be done when #11677 is addressed.

Running `pip show packaging` now results in 420 entries in sys.modules
instead of 607.
This notably helps pip freeze which doesn't need the network unless a
URL is given as a requirement file.
So freeze doesn't need to import the index/network modules.
... to avoid importing all of the network/index related modules.
@pytest.mark.parametrize(
"command", ["cache", "check", "config", "freeze", "hash", "help", "inspect", "show"]
)
def test_no_network_imports(command: str) -> None:
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test is admittedly janky, I would appreciate ideas for preventing regressions that are cleaner :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A more robust approach that doesn't require using the internals would be to use runpy. You could do something like:

import runpy
import sys

sys.argv[1:] = [{command}, "--help"]

try:
    runpy.run_module("pip", alter_sys=True, run_name="__main__")
finally:
    with open({file!r}, "w") as f:
        print(list(sys.modules.keys()), file=f)

and then have file and command templated in as f-strings (file being under tmp_path fixture's folder would be cleanest, I think).

Also this test creates a bunch of subprocesses, and should live under integration tests for that reason.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, that's much cleaner. Thanks a ton!

tests/unit/test_commands.py Outdated Show resolved Hide resolved
@pytest.mark.parametrize(
"command", ["cache", "check", "config", "freeze", "hash", "help", "inspect", "show"]
)
def test_no_network_imports(command: str) -> None:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A more robust approach that doesn't require using the internals would be to use runpy. You could do something like:

import runpy
import sys

sys.argv[1:] = [{command}, "--help"]

try:
    runpy.run_module("pip", alter_sys=True, run_name="__main__")
finally:
    with open({file!r}, "w") as f:
        print(list(sys.modules.keys()), file=f)

and then have file and command templated in as f-strings (file being under tmp_path fixture's folder would be cleanest, I think).

Also this test creates a bunch of subprocesses, and should live under integration tests for that reason.

@ichard26
Copy link
Member Author

ichard26 commented Apr 8, 2024

Thanks @pradyunsg for the review! You'll also like to hear that I filed a PR updating tenacity to lazy import asyncio: jd/tenacity#450. This reduces import overhead by 10-15 more milliseconds.

Rewriting the test to take the difference from the command registry has revealed that pip uninstall also imports the network modules. AFAIU, this is for situations where a remote requirement file is given (-r <url>) which is certainly a rare usecase. I can adjust the uninstall command to avoid importing the network modules unless truly necessary, but that'll have to wait for when I have more free time.

tests/functional/test_cli.py Outdated Show resolved Hide resolved
Copy link
Member

@uranusjr uranusjr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This kind of shows some functions are likely out of place and could use some refactoring. The PR is good though.

@pradyunsg pradyunsg merged commit 550388e into pypa:main Apr 9, 2024
24 checks passed
@ichard26 ichard26 deleted the lazy-imports branch April 17, 2024 22:31
@github-actions github-actions bot locked as resolved and limited conversation to collaborators May 3, 2024
@ichard26 ichard26 added the type: performance Commands take too long to run label Dec 26, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants