@radoering radoering commented Dec 11, 2025

On projects with many developers, outdated local caches on some machines regularly cause unnecessary diffs in the lockfile. A cache usually becomes outdated when artifacts for a release are uploaded after the release was first cached. This change detects such diffs and refreshes the cache automatically.

Pull Request Check List

  • Added tests for changed code.
  • Updated documentation for changed code.

Summary by Sourcery

Automatically refresh cached package metadata when the files recorded in the lockfile differ from those in the repository cache, to avoid unnecessary lockfile diffs.

New Features:

  • Add repository pool refresh mechanism that invalidates cached package metadata and reloads it from the underlying repository when needed.

Bug Fixes:

  • Ensure packages completed from the pool are refreshed when their cached file list is outdated compared to the locked package, preventing spurious lockfile changes.

Enhancements:

  • Track refreshed packages in the provider to avoid repeated cache refreshes for the same package version and source.
  • Expose a forget operation on cached repositories to clear cached release info entries by name and version.
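To make the "forget" idea concrete, here is a minimal sketch of a cache that can evict a single release entry. It is a toy stand-in, not Poetry's actual class: `ToyCachedRepository`, the `fetch` callable, and the `name:version` key format are all assumptions made for illustration.

```python
from __future__ import annotations

from typing import Callable


class ToyCachedRepository:
    """Toy stand-in for a repository that caches release info (not Poetry's class)."""

    def __init__(self, fetch: Callable[[str, str], dict]) -> None:
        self._fetch = fetch  # hypothetical loader that would query the package index
        self._release_cache: dict[str, dict] = {}

    @staticmethod
    def _key(name: str, version: str) -> str:
        # Assumed key format: "<lowercased name>:<version>".
        return f"{name.lower()}:{version}"

    def get_release_info(self, name: str, version: str) -> dict:
        key = self._key(name, version)
        if key not in self._release_cache:
            self._release_cache[key] = self._fetch(name, version)
        return self._release_cache[key]

    def forget(self, name: str, version: str) -> None:
        # Evict one entry; forgetting an unknown entry is not an error.
        self._release_cache.pop(self._key(name, version), None)


# Simulate an index that gains a wheel after the first lookup.
index = {"files": ["demo-1.0.tar.gz"]}
calls = []


def fetch(name: str, version: str) -> dict:
    calls.append((name, version))
    return {"files": list(index["files"])}


repo = ToyCachedRepository(fetch)
first = repo.get_release_info("demo", "1.0")
index["files"].append("demo-1.0-py3-none-any.whl")  # postponed artifact upload
stale = repo.get_release_info("demo", "1.0")         # still served from the cache
repo.forget("demo", "1.0")
fresh = repo.get_release_info("demo", "1.0")         # fetched again
```

In this sketch the second lookup still returns the old file list, and only the lookup after `forget` sees the late wheel.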

Tests:

  • Add provider tests covering cache refresh behavior when package files in the lock differ from cached repository data.
  • Extend lock and add command tests to account for cache refresh logic and ensure no unnecessary updates occur.
  • Add cached repository tests verifying release info caching, cache disabling, and behavior after clearing cached entries.

… in cache

sourcery-ai bot commented Dec 11, 2025

Reviewer's Guide

Adds automatic cache refresh for locked packages when their file lists differ from cached repository data, plus supporting repository APIs and tests to ensure outdated caches are corrected and not refreshed unnecessarily.

Sequence diagram for automatic cache refresh on package file mismatch

sequenceDiagram
    participant Provider
    participant RepositoryPool
    participant CachedRepository
    participant ReleaseCache as Release_cache

    Provider->>RepositoryPool: package(pretty_name, version, repository_name)
    RepositoryPool->>CachedRepository: package(name, version)
    CachedRepository->>ReleaseCache: get_release_info(canonical_name, version)
    ReleaseCache-->>CachedRepository: release_info
    CachedRepository-->>RepositoryPool: Package(pool_package)
    RepositoryPool-->>Provider: Package(pool_package)

    Provider->>Provider: compare package.files with pool_package.files
    alt file lists differ
        Provider->>RepositoryPool: refresh(pool_package)
        RepositoryPool->>RepositoryPool: determine repository_name
        RepositoryPool->>CachedRepository: forget(name, version)
        CachedRepository->>ReleaseCache: forget(canonical_name:version)
        ReleaseCache-->>CachedRepository: removed
        RepositoryPool->>CachedRepository: package(name, version)
        CachedRepository->>ReleaseCache: get_release_info(canonical_name, version)
        ReleaseCache-->>CachedRepository: fresh_release_info
        CachedRepository-->>RepositoryPool: Package(refreshed_package)
        RepositoryPool-->>Provider: Package(refreshed_package)
        Provider->>Provider: add key to _refreshed
    else file lists equal
        Provider->>Provider: keep pool_package
    end

    Provider->>Provider: create DependencyPackage and continue resolution
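
The flow in the sequence diagram can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not the code from the pull request: `ToyPackage`, `ToyPool`, `ToyProvider`, and `files_differ` are hypothetical names, the hashes are placeholders, and the sketch simply skips the comparison for an already refreshed key rather than reproducing how the real provider looks packages up after a refresh.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class ToyPackage:
    name: str
    version: str
    files: list[dict[str, str]] = field(default_factory=list)
    source_name: str | None = None


def files_differ(locked: ToyPackage, pooled: ToyPackage) -> bool:
    # Compare sorted file lists so that ordering alone never counts as a difference.
    def key(f: dict[str, str]) -> tuple[str, str]:
        return (f["file"], f["hash"])

    return sorted(locked.files, key=key) != sorted(pooled.files, key=key)


class ToyPool:
    """Toy pool: `cached` may be stale, `upstream` holds the current data."""

    def __init__(self, cached: dict, upstream: dict) -> None:
        self._cached = cached
        self._upstream = upstream
        self.refresh_calls = 0

    def package(self, name: str, version: str) -> ToyPackage:
        return ToyPackage(name, version, list(self._cached[(name, version)]))

    def refresh(self, package: ToyPackage) -> ToyPackage:
        key = (package.name, package.version)
        self._cached[key] = list(self._upstream[key])  # drop stale entry, reload
        self.refresh_calls += 1
        return self.package(*key)


class ToyProvider:
    def __init__(self, pool: ToyPool) -> None:
        self._pool = pool
        self._refreshed: set[tuple[str, str, str | None]] = set()

    def complete_package(self, locked: ToyPackage) -> ToyPackage:
        pool_package = self._pool.package(locked.name, locked.version)
        key = (locked.name, locked.version, locked.source_name)
        # Refresh at most once per (name, version, source) key.
        if key not in self._refreshed and files_differ(locked, pool_package):
            pool_package = self._pool.refresh(pool_package)
            self._refreshed.add(key)
        return pool_package


# Placeholder file entries; the hashes are not real.
SDIST = {"file": "demo-1.0.tar.gz", "hash": "sha256:aaa"}
WHEEL = {"file": "demo-1.0-py3-none-any.whl", "hash": "sha256:bbb"}

pool = ToyPool(
    cached={("demo", "1.0"): [SDIST]},           # cache predates the wheel upload
    upstream={("demo", "1.0"): [SDIST, WHEEL]},
)
provider = ToyProvider(pool)
locked = ToyPackage("demo", "1.0", files=[WHEEL, SDIST])

completed = provider.complete_package(locked)
again = provider.complete_package(locked)
```

Here the first call triggers a single refresh because the cached file list lacks the wheel, and the second call finds the key in `_refreshed` and does not refresh again.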

Class diagram for cache refresh of locked packages

classDiagram
    class Provider {
        +RepositoryPool _pool
        +Callable get_package_from_pool
        +set~tuple~ _refreshed
        +complete_package(package, dependency) DependencyPackage
    }

    class RepositoryPool {
        +list~Repository~ repositories
        +package(name, version, repository_name) Package
        +repository(name) Repository
        +search(query) list~Package~
        +refresh(package) Package
    }

    class CachedRepository {
        +dict _release_cache
        +package(name, version) Package
        +forget(name, version) void
    }

    class Package {
        +str pretty_name
        +Version version
        +list~dict~ files
        +str source_reference
        +str name
    }

    class DependencyPackage {
        +Dependency dependency
        +Package package
    }

    class Dependency {
        +str source_name
    }

    class Repository {
        <<interface>>
        +package(name, version) Package
        +search(query) list~Package~
    }

    Provider --> RepositoryPool : uses
    RepositoryPool --> Repository : aggregates
    Repository <|-- CachedRepository
    Provider --> DependencyPackage : creates
    DependencyPackage --> Package
    DependencyPackage --> Dependency
    RepositoryPool --> Package : refresh(package)
    CachedRepository --> Package : package(name, version)
    CachedRepository --> Package : forget(name, version)

File-Level Changes

Change: Provider now compares locked package files with repository package files and refreshes cached data once per package/source when they differ.
Details:
  • Introduced a per-provider _refreshed set keyed by (name, version, source_name) to track which packages have already triggered a refresh.
  • Adjusted complete_package to optionally bypass the cached pool accessor and use RepositoryPool.package directly when a refresh has already occurred for the package/source tuple.
  • Added logic in complete_package to compare sorted file lists between the locked package and the pool package, calling RepositoryPool.refresh and recording the refresh when they differ, and then building DependencyPackage from the refreshed package.
Files:
  • src/poetry/puzzle/provider.py
  • tests/puzzle/test_provider.py

Change: RepositoryPool and CachedRepository gained APIs to invalidate cached release info and reload a package from its source repository.
Details:
  • Added RepositoryPool.refresh, which resolves the backing repository from a package's source_reference (defaulting to PyPI), calls forget on CachedRepository instances, and re-fetches the package.
  • Extended CachedRepository with a forget method that evicts a single release entry from its internal _release_cache.
  • Created tests for CachedRepository.get_release_info behavior with and without cache disabled, including cache invalidation via forget.
Files:
  • src/poetry/repositories/repository_pool.py
  • src/poetry/repositories/cached_repository.py
  • tests/repositories/test_cached_repository.py

Change: Existing lock and add command tests were updated to ensure they do not spuriously trigger cache refresh by aligning in-memory package file lists with the lockfile.
Details:
  • Updated lock command tests to capture the package added to the repository and populate its files from the locked repository before executing the command.
  • Updated add command tests to ensure the locked docker package's files are synchronized with the locker's locked_repository when the locked fixture is used.
Files:
  • tests/console/commands/test_lock.py
  • tests/console/commands/test_add.py
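
The pool-level refresh described above (resolve the repository from `source_reference`, call `forget` only on caching repositories, then re-fetch) might look roughly like the following. All class names here are hypothetical toy types, not Poetry's implementation, and the literal "PyPI" default is taken from the description above rather than verified against the code.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class ToyPackage:
    name: str
    version: str
    files: tuple[str, ...] = ()
    source_reference: str | None = None


class PlainRepository:
    """Repository without a cache: always reads the current data."""

    def __init__(self, data: dict) -> None:
        self._data = data

    def package(self, name: str, version: str) -> ToyPackage:
        return ToyPackage(name, version, tuple(self._data[(name, version)]))


class CachingRepository(PlainRepository):
    """Repository that remembers the first file list it saw per release."""

    def __init__(self, data: dict) -> None:
        super().__init__(data)
        self._release_cache: dict[str, tuple[str, ...]] = {}

    def package(self, name: str, version: str) -> ToyPackage:
        key = f"{name}:{version}"
        if key not in self._release_cache:
            self._release_cache[key] = tuple(self._data[(name, version)])
        return ToyPackage(name, version, self._release_cache[key])

    def forget(self, name: str, version: str) -> None:
        self._release_cache.pop(f"{name}:{version}", None)


class ToyPool:
    def __init__(self, repositories: dict[str, PlainRepository]) -> None:
        self._repositories = repositories

    def refresh(self, package: ToyPackage) -> ToyPackage:
        # Assumed default: fall back to a repository literally named "PyPI".
        repo = self._repositories[package.source_reference or "PyPI"]
        if isinstance(repo, CachingRepository):
            repo.forget(package.name, package.version)
        return repo.package(package.name, package.version)


data = {("demo", "1.0"): ["demo-1.0.tar.gz"]}
repo = CachingRepository(data)
pool = ToyPool({"PyPI": repo})

stale = repo.package("demo", "1.0")
data[("demo", "1.0")].append("demo-1.0-py3-none-any.whl")  # late upload
still_stale = repo.package("demo", "1.0")
refreshed = pool.refresh(stale)
```

Note that the lookup raises `KeyError` if no repository named "PyPI" is registered, which is the kind of mismatch a hard-coded default can produce.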

@sourcery-ai sourcery-ai bot left a comment

Hey there - I've reviewed your changes - here's some feedback:

  • The _refreshed set in Provider grows monotonically for the lifetime of the provider; consider a strategy to bound or periodically clear it (or key it more narrowly) to avoid unbounded memory growth on large or long-running resolves.
  • In RepositoryPool.refresh, the fallback to the literal "PyPI" when source_reference is missing relies on that exact repository name being present; it may be safer to reuse the same defaulting logic as other pool lookups or make the default repository explicit at construction time to avoid subtle mismatches.
  • There are two very similar MockCachedRepository test doubles (in tests/puzzle/test_provider.py and tests/repositories/test_cached_repository.py); consider extracting a shared helper or fixture to reduce duplication and keep behavior aligned if one evolves.
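
If the growth of `_refreshed` ever turned out to matter in practice, one possible way to bound it is a small insertion-ordered set that evicts its oldest key. This is only a sketch of that idea: `BoundedSeenSet` is a hypothetical helper, not part of Poetry or of this pull request.

```python
from __future__ import annotations

from collections import OrderedDict
from typing import Hashable


class BoundedSeenSet:
    """Remembers at most `maxsize` keys, evicting the oldest first (hypothetical helper)."""

    def __init__(self, maxsize: int) -> None:
        if maxsize < 1:
            raise ValueError("maxsize must be at least 1")
        self._maxsize = maxsize
        self._keys: OrderedDict[Hashable, None] = OrderedDict()

    def add(self, key: Hashable) -> None:
        self._keys[key] = None
        self._keys.move_to_end(key)  # re-adding a key marks it as most recent
        while len(self._keys) > self._maxsize:
            self._keys.popitem(last=False)  # drop the oldest key

    def __contains__(self, key: object) -> bool:
        return key in self._keys

    def __len__(self) -> int:
        return len(self._keys)


seen = BoundedSeenSet(maxsize=2)
seen.add(("a", "1.0", None))
seen.add(("b", "1.0", None))
seen.add(("c", "1.0", None))  # evicts the entry for "a"
```

The trade-off is that an evicted key can trigger one more refresh for the same package later, so the bound trades memory for occasional repeated work.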