Skip to content

SNOW-3315838: Honor dockerignore file if present#2845

Open
sfc-gh-svasudevan wants to merge 4 commits intomainfrom
svasudevan-add-dockerignore-honoring
Open

SNOW-3315838: Honor dockerignore file if present#2845
sfc-gh-svasudevan wants to merge 4 commits intomainfrom
svasudevan-add-dockerignore-honoring

Conversation

@sfc-gh-svasudevan
Copy link
Copy Markdown
Contributor

@sfc-gh-svasudevan sfc-gh-svasudevan commented Mar 31, 2026

When a .dockerignore file is present in the build context, the CLI now filters out matching files before uploading to the stage -- matching the behavior of docker build. The .dockerignore file itself is still uploaded so BuildKit can reference it during the image build. If no .dockerignore is present, all files are uploaded as before.

Pre-review checklist

  • I've confirmed that instructions included in README.md are still correct after my changes in the codebase.
  • I've added or updated automated unit tests to verify correctness of my new code.
  • I've added or updated integration tests to verify correctness of my new code.
  • I've confirmed that my changes are working by executing CLI's commands manually on MacOS.
  • I've confirmed that my changes are working by executing CLI's commands manually on Windows.
  • I've confirmed that my changes are up-to-date with the target branch.
  • I've described my changes in the release notes.
  • I've described my changes in the section below.
  • I've described my changes in the documentation.

Changes description

...

@sfc-gh-svasudevan sfc-gh-svasudevan changed the title Svasudevan add dockerignore honoring SNOW-3315838: Honor dockerignore file if present Mar 31, 2026
@sfc-gh-svasudevan sfc-gh-svasudevan marked this pull request as ready for review March 31, 2026 21:09
@sfc-gh-svasudevan sfc-gh-svasudevan requested a review from a team as a code owner March 31, 2026 21:09
return SingleQueryResult(cursor)


def _load_dockerignore_patterns(build_context_dir: Path) -> List[str]:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Put into in a separate module?

Description

Testing
for pattern in patterns:
if fnmatch.fnmatch(rel_path_posix, pattern):
return True
if fnmatch.fnmatch(rel_path_posix, pattern + "/**"):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

**/ prefix patterns fail at root level

is_ignored('node_modules', ['**/node_modules']) returns False. Common patterns like **/node_modules, **/__pycache__, and **/.git silently fail to match root-level entries because fnmatch requires a literal / before the segment name.

Fix: when a pattern starts with **/, strip the prefix and retry:

if pattern.startswith("**/"):
    if fnmatch.fnmatch(rel_path_posix, pattern[3:]):
        return True
    if fnmatch.fnmatch(rel_path_posix, pattern[3:] + "/**"):
        return True

line = line.strip()
if not line or line.startswith("#"):
continue
patterns.append(line)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Negation patterns (!) are silently ignored

Docker's .dockerignore supports !pattern to re-include previously excluded files (e.g., *.md then !README.md). Currently !README.md is passed through as a literal pattern — fnmatch.fnmatch('README.md', '!README.md') returns False, so the negation is a silent no-op and those files get incorrectly excluded.

At minimum, emit a warning when a !-prefixed line is encountered:

if line.startswith("!"):
    cli_console.warning(f"Negation patterns in .dockerignore are not yet supported, skipping: {line}")
    continue

similar to how .dockerignore works.
"""
rel_path_posix = rel_path.replace(os.sep, "/")
for pattern in patterns:
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leading slash patterns (/Dockerfile) silently do nothing

Docker uses a leading / to anchor a pattern to the build context root (e.g., /temp matches only temp at the root, not a/temp). fnmatch.fnmatch('temp', '/temp') returns False, so these patterns silently match nothing.

Fix: strip a leading / from each pattern before matching, which gives root-only semantics since os.walk paths are relative (no leading /).

"""
rel_path_posix = rel_path.replace(os.sep, "/")
for pattern in patterns:
if fnmatch.fnmatch(rel_path_posix, pattern):
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trailing slash directory patterns (logs/) silently do nothing

Docker treats logs/ as a directory-only match. fnmatch.fnmatch('logs/app.log', 'logs/') returns False and no file path ever equals logs/, so these patterns silently match nothing.

Fix: strip a trailing / from each pattern before matching. Since os.walk separates dirs and files naturally, stripping the slash gives the correct semantics.

if not dockerignore_path.is_file():
return []
patterns = []
for line in dockerignore_path.read_text().splitlines():
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

read_text() missing encoding="utf-8"

On Windows with a non-UTF-8 system encoding this may fail or misread the file. Use read_text(encoding="utf-8") for consistent cross-platform behavior.

):
pass
else:
for _ in stage_manager.put_recursive(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

else branch (no .dockerignore) is untested

Both CLI-level tests use build contexts that include a .dockerignore, so this path — where files go directly through put_recursive without filtering — has zero test coverage. This is the default path for most users.

Please add a test that invokes build-image with a build context that has no .dockerignore and verifies files are uploaded.

ignore_patterns = load_dockerignore_patterns(build_context_dir)
if ignore_patterns:
cli_console.message(
f"Using .dockerignore ({len(ignore_patterns)} pattern(s))"
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider logging excluded files at debug level

If a build fails because an important file was accidentally excluded by .dockerignore, the user has no diagnostic output to trace the issue. Logging excluded files (or a count) at a debug/verbose level would make troubleshooting much easier.

with TemporaryDirectory() as tmp_dir:
filtered_dir = Path(tmp_dir)
copy_filtered_build_context(
build_context_dir, filtered_dir, ignore_patterns
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider warning when .dockerignore excludes all non-Dockerfile files

If the patterns are overly broad and exclude everything except the Dockerfile, the build will fail downstream with a confusing error. A simple check after copy_filtered_build_context — warning if only the Dockerfile remains in the filtered directory — would save users significant debugging time.

assert "log 2" in generated_logs[1]


TEST_BUILD_IMAGE_DIRECTORIES = (
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test coupled to integration fixture

This path points into tests_integration/, so someone modifying that fixture (adding/removing files, changing .dockerignore content) could silently break this unit test. The test also inadvertently exercises the dockerignore-filtered path (that directory has a .dockerignore), despite the name suggesting a plain recursive upload.

Consider moving the fixture to tests/test_data/projects/spcs_build_context_nested/ (without a .dockerignore, so it actually tests the non-filtered path) and using the existing project_directory fixture:

def test_build_image_cli_recursive_upload_with_nested_dirs(
    ..., project_directory
):
    with project_directory("spcs_build_context_nested") as build_context:
        ...

This follows the convention used throughout test_services.py and avoids the cross-module coupling.


put_calls = [
c
for c in mock_stage_execute_query.call_args_list
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplicated mock boilerplate and fragile PUT SQL parsing

The ServiceManager/ObjectManager/SnowflakeCursor mock setup is copy-pasted verbatim between this test and test_build_image_cli_dockerignore_excludes_files. Consider extracting it into a shared @pytest.fixture.

Additionally, filtering calls via str(c).startswith("call('put ") and splitting on "@" depends on mock __repr__ format and exact SQL structure. Prefer inspecting call_args kwargs directly, which is the pattern used elsewhere in this test file.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants