Reduce memory consumption in FilesCollection#1219

Open
ckuran wants to merge 1 commit into phpro:v2.x from ckuran:reduce-memory-consumption

@ckuran ckuran commented May 15, 2026

| Q | A |
| --- | --- |
| Branch | 2.x |
| Bug fix? | yes |
| New feature? | no |
| BC breaks? | no |
| Deprecations? | no |
| Documented? | n/a |
| Fixed tickets | #1218 |

This PR reduces memory consumption in FilesCollection. The key changes:

  1. Lower memory in parallel mode. When GrumPHP runs tasks in parallel via amphp/parallel, the file collection is serialized and shipped to worker processes. Previously, each file was immediately unpacked into a full SymfonySplFileInfo object, which consumed a lot of memory. Now the worker receives plain strings and only builds the objects when something actually iterates over them.
  2. ensureFiles() appends one file collection to another without duplicates. It now keeps known paths in a plain hash-map array, so each duplicate check takes constant time.
  3. Added a "Peak memory usage: X.X MB" line at the end of commands for easier diagnostics (verbose mode only).
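
To illustrate points 1 and 2, here is a minimal sketch of the idea, not the actual FilesCollection code. The class name `LazyPathCollection` is hypothetical, and it uses plain `SplFileInfo` instead of SymfonySplFileInfo so that it runs without dependencies. It assumes paths are stored as array keys, serialized as plain strings, and turned into objects only while iterating.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical sketch, not GrumPHP's FilesCollection: keeps paths as
 * strings and builds SplFileInfo objects only during iteration.
 */
final class LazyPathCollection implements IteratorAggregate, Countable
{
    /** @var array<array-key, true> known paths, used as a hash set */
    private array $paths = [];

    /** @param iterable<string> $paths */
    public function __construct(iterable $paths = [])
    {
        foreach ($paths as $path) {
            $this->paths[$path] = true;
        }
    }

    /** Returns a copy that also contains the other collection's paths, without duplicates. */
    public function ensureFiles(self $other): self
    {
        $merged = clone $this;
        foreach ($other->paths as $path => $_) {
            // isset() on an array key is a hash lookup, not a scan of the list.
            if (!isset($merged->paths[$path])) {
                $merged->paths[$path] = true;
            }
        }

        return $merged;
    }

    /** Only plain path strings are serialized, not file objects. */
    public function __serialize(): array
    {
        return array_keys($this->paths);
    }

    public function __unserialize(array $data): void
    {
        $this->paths = array_fill_keys($data, true);
    }

    /** Objects are created one at a time, when a consumer iterates. */
    public function getIterator(): Generator
    {
        foreach ($this->paths as $path => $_) {
            // PHP turns numeric string keys into ints, so cast back to string.
            yield new SplFileInfo((string) $path);
        }
    }

    public function count(): int
    {
        return count($this->paths);
    }
}

$a = new LazyPathCollection(['src/A.php', 'src/B.php']);
$b = new LazyPathCollection(['src/B.php', 'src/C.php']);

// Round-trip through serialize() to mimic shipping the collection to a worker.
$merged = unserialize(serialize($a->ensureFiles($b)));

echo count($merged), PHP_EOL; // prints 3
```

Note that a generator-based iterator like this one creates fresh objects on every pass, so whether this saves memory overall depends on how often the collection is iterated.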

Results

Before

Running tasks with priority 0!
==============================

Running task 1/12: jsonlint... ✔
Running task 2/12: xmllint... ✔
Running task 3/12: yamllint... ✔
Running task 4/12: openapi... ✔
Running task 5/12: rector... ✔
Running task 6/12: behat... ✔
Running task 7/12: phpmd... ✔
Running task 8/12: phpcs... ✔
Running task 9/12: phpstan... ✔
Running task 10/12: psalm... ✔
Running task 11/12: test-unit... ✔
Running task 12/12: test-integration... ✔

Peak memory usage: 1152.0 MB

After

Running tasks with priority 0!
==============================

Running task 1/12: jsonlint... ✔
Running task 2/12: xmllint... ✔
Running task 3/12: yamllint... ✔
Running task 4/12: openapi... ✔
Running task 5/12: rector... ✔
Running task 6/12: behat... ✔
Running task 7/12: phpmd... ✔
Running task 8/12: phpcs... ✔
Running task 9/12: phpstan... ✔
Running task 10/12: psalm... ✔
Running task 11/12: test-unit... ✔
Running task 12/12: test-integration... ✔

Peak memory usage: 126.0 MB
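
For reference, a line like "Peak memory usage: X.X MB" could be produced roughly as follows. This is a sketch under assumptions: the helper name `formatPeakMemory` is hypothetical, and the PR's actual implementation (including how it restricts the output to verbose mode) is not shown here.

```php
<?php
declare(strict_types=1);

/** Hypothetical helper: formats a byte count as megabytes with one decimal. */
function formatPeakMemory(int $bytes): string
{
    return sprintf('Peak memory usage: %.1f MB', $bytes / 1024 / 1024);
}

// memory_get_peak_usage(true) returns the peak memory allocated from the
// system for the current PHP process, in bytes.
echo formatPeakMemory(memory_get_peak_usage(true)), PHP_EOL;
```

The figure covers only the process that calls it, which is worth keeping in mind when comparing numbers from runs that use worker processes.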

@ckuran ckuran force-pushed the reduce-memory-consumption branch from b506c1d to 686c3cd Compare May 21, 2026 08:35
foreach ($files as $file) {
$this->add(new SymfonySplFileInfo($file, dirname($file), $file));
foreach ($data as $path) {
$this->add($path);
Contributor

The signature, however, is ArrayCollection<array-key, \SplFileInfo>.
The collection now also contains strings, which breaks the public API contract (see e.g. getFirst() vs. first(), and possibly similar methods such as last(), current(), ...).

I'm also wondering: every time an iterator goes over the collection, the entries will be hydrated again.
So if you have more tasks (or file filters), memory will keep increasing.

This makes me wonder: what is the ACTUAL memory issue that we are trying to solve?
Can we find a better solution than keeping the entries as intermediate strings?

It has been a while, but: what benefit does SymfonySplFileInfo give over a regular SplFileInfo here? Which one adds the overhead?
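
One way to approach the overhead question is to measure it directly. The following is a rough sketch, not a benchmark from this PR: the helper `retainedBytes` and the generated paths are made up for illustration, and it measures plain `SplFileInfo` only. To compare against SymfonySplFileInfo, the factory closure would be swapped for the three-argument constructor call shown in the diff above.

```php
<?php
declare(strict_types=1);

/** Hypothetical helper: bytes still allocated for the value that $build returns. */
function retainedBytes(callable $build): int
{
    gc_collect_cycles();
    $before = memory_get_usage();
    $value = $build();               // keep the result alive while measuring
    $after = memory_get_usage();
    unset($value);

    return $after - $before;
}

// Made-up paths purely for illustration.
$paths = [];
for ($i = 0; $i < 10000; $i++) {
    $paths[] = "src/Module{$i}/File{$i}.php";
}

$asObjects = retainedBytes(static fn (): array => array_map(
    static fn (string $path): SplFileInfo => new SplFileInfo($path),
    $paths
));

printf(
    "%d SplFileInfo objects: %d bytes (about %d bytes each)\n",
    count($paths),
    $asObjects,
    intdiv($asObjects, count($paths))
);
```

The numbers vary by PHP version and platform, so they are best read as a relative comparison between the two classes on the same machine.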

2 participants