Skip to content

fix: populate database_name from schema for ClickHouse (CORE-559)#976

Open
devin-ai-integration[bot] wants to merge 2 commits intomasterfrom
core-559-fix-null-database-name-clickhouse
Open

fix: populate database_name from schema for ClickHouse (CORE-559)#976
devin-ai-integration[bot] wants to merge 2 commits intomasterfrom
core-559-fix-null-database-name-clickhouse

Conversation

@devin-ai-integration
Copy link
Contributor

@devin-ai-integration devin-ai-integration bot commented Mar 25, 2026

fix: populate database_name from schema for ClickHouse (CORE-559)

Summary

ClickHouse does not have a separate database/schema concept — the dbt-clickhouse adapter sets node.database to None. This causes database_name to be NULL in all dbt artifact models (models, sources, seeds, tests, columns), which breaks the Cloud product.

ClickHouse's own information_schema assigns the same value to both the database and schema columns. This PR mirrors that behavior by introducing a dispatched get_node_database() macro:

  • default__get_node_database: returns node.get("database") (unchanged behavior for all other warehouses)
  • clickhouse__get_node_database: returns node.get("database") or node.get("schema") (falls back to schema when database is None)

All 7 call sites that extract a database name from a node dict are updated to use this macro. This follows the same dispatch pattern already used by clickhouse__get_package_database_and_schema.

Updates since last revision

  • Fixed get_relevant_databases.sql line 2: target.database is also None on ClickHouse, so the initial database list now uses target.database or target.schema to avoid a None entry. Note this is a direct or fallback (not routed through the dispatch macro) since target is not a node dict.

Review & Testing Checklist for Human

  • Metadata hash impact: database_name feeds into get_artifact_metadata_hash — ClickHouse users will see hash changes after deploy, triggering re-upload of all artifacts on next run. Verify this is acceptable.
  • target.database or target.schema consistency: Line 2 of get_relevant_databases.sql uses a generic or fallback rather than the ClickHouse-specific dispatch pattern. This is because target is not a node dict, but confirm this approach is acceptable given the preference for explicit ClickHouse-only logic.
  • Completeness check: Grep for any remaining node.get("database") or .get("database") usages in artifact-related macros that may have been missed.
  • End-to-end verification: Run against a ClickHouse target and verify database_name is populated (not NULL) in dbt_models, dbt_sources, dbt_seeds, dbt_tests, and dbt_columns tables.

Notes

Summary by CodeRabbit

  • Refactor
    • Improved database name resolution across dbt artifact collection and test handling for more consistent column/model/seed/source metadata and stable metadata hashes.
    • Added adapter-aware behavior to better support platforms that use schema as a database (improving compatibility with non-standard database configurations).

ClickHouse does not have a separate database/schema concept — the
dbt-clickhouse adapter sets node.database to None, causing NULL
database_name in dbt artifact models.

Add get_node_database() dispatched macro that falls back to schema
for ClickHouse only (mirroring ClickHouse's information_schema
behavior). Update all 7 call sites in artifact upload macros.

Co-Authored-By: unknown <>
@devin-ai-integration
Copy link
Contributor Author

🤖 Devin AI Engineer

I'll be helping with this pull request! Here's what you should know:

✅ I will automatically:

  • Address comments on this PR. Add '(aside)' to your comment to have me ignore it.
  • Look at CI failures and help fix them

Note: I can only respond to comments from users who have write access to this repository.

⚙️ Control Options:

  • Disable automatic comment and CI monitoring

@linear
Copy link

linear bot commented Mar 25, 2026

@github-actions
Copy link
Contributor

👋 @devin-ai-integration[bot]
Thank you for raising your pull request.
Please make sure to add tests and document all user-facing changes.
You can do this by editing the docs files in the elementary repository.

@coderabbitai
Copy link

coderabbitai bot commented Mar 25, 2026

📝 Walkthrough

Walkthrough

Added an adapter-dispatched macro get_node_database() with a ClickHouse fallback and updated multiple macros to use it for deriving node database names instead of directly reading node.get("database").

Changes

Cohort / File(s) Summary
New Node Database Utility
macros/utils/graph/get_node_database.sql
Added get_node_database(node) (adapter.dispatch), with default__get_node_database returning node.get("database") and clickhouse__get_node_database falling back to node.get("schema") when needed.
EDR / dbt artifact upload macros
macros/edr/dbt_artifacts/upload_dbt_columns.sql, macros/edr/dbt_artifacts/upload_dbt_models.sql, macros/edr/dbt_artifacts/upload_dbt_seeds.sql, macros/edr/dbt_artifacts/upload_dbt_sources.sql, macros/edr/dbt_artifacts/upload_dbt_tests.sql
Replaced direct node.get("database") lookups with elementary.get_node_database(node) for database_name (or primary_test_model_database) computation; affects emitted artifact database_name values and related metadata hashing.
Graph utilities
macros/utils/graph/get_model_database_and_schema_from_test_node.sql, macros/utils/graph/get_relevant_databases.sql
Delegated test/database extraction to elementary.get_node_database(node) and adjusted initialization of target database (use target.database or fallback to target.schema).

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐇 I hopped through macros, tidy and spry,
Pulled database names from one single sky,
ClickHouse a soft fallback when schema's the key,
Seven friends updated — neat as can be! ✨

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
Check name Status Explanation
Title check ✅ Passed The title clearly and accurately describes the main change: introducing a fix for populating database_name from schema for ClickHouse, which is the core objective of the PR.
Docstring Coverage ✅ Passed No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Description Check ✅ Passed Check skipped - CodeRabbit’s high-level summary is enabled.

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

✨ Finishing Touches
📝 Generate docstrings
  • Create stacked PR
  • Commit on current branch
🧪 Generate unit tests (beta)
  • Create PR with unit tests
  • Commit unit tests in branch core-559-fix-null-database-name-clickhouse

Comment @coderabbitai help to get the list of available commands and usage tips.

Copy link

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧹 Nitpick comments (1)
macros/utils/graph/get_relevant_databases.sql (1)

2-2: Pre-existing issue: target.database may also be None on ClickHouse.

As noted in the PR checklist, Line 2 still reads target.database directly. On ClickHouse, this could be None, resulting in None appearing in the returned database list. While this PR focuses on node-level extraction, consider a follow-up to apply similar fallback logic here (e.g., target.database or target.schema).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@macros/utils/graph/get_relevant_databases.sql` at line 2, The list assignment
for database_names uses target.database directly which can be None on
ClickHouse; update the macro in get_relevant_databases.sql to fallback to
target.schema when target.database is falsy (e.g., set database_names to
[target.database or target.schema]) so the returned list never contains None;
ensure you update any references that expect database_names to remain a list of
strings and keep the same variable name database_names for compatibility.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Nitpick comments:
In `@macros/utils/graph/get_relevant_databases.sql`:
- Line 2: The list assignment for database_names uses target.database directly
which can be None on ClickHouse; update the macro in get_relevant_databases.sql
to fallback to target.schema when target.database is falsy (e.g., set
database_names to [target.database or target.schema]) so the returned list never
contains None; ensure you update any references that expect database_names to
remain a list of strings and keep the same variable name database_names for
compatibility.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 777f2573-b8b7-4b98-8d09-88abc9a812ae

📥 Commits

Reviewing files that changed from the base of the PR and between b8c7ab0 and 83c0a84.

📒 Files selected for processing (8)
  • macros/edr/dbt_artifacts/upload_dbt_columns.sql
  • macros/edr/dbt_artifacts/upload_dbt_models.sql
  • macros/edr/dbt_artifacts/upload_dbt_seeds.sql
  • macros/edr/dbt_artifacts/upload_dbt_sources.sql
  • macros/edr/dbt_artifacts/upload_dbt_tests.sql
  • macros/utils/graph/get_model_database_and_schema_from_test_node.sql
  • macros/utils/graph/get_node_database.sql
  • macros/utils/graph/get_relevant_databases.sql

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant