[PECOBLR-1746] Implementing support for listing procedures #1238
msrathore-db merged 17 commits into databricks:main from
Conversation
Force-pushed 33cc3e9 to 32e1fcd
| sql.append(" AND routine_schema LIKE '").append(schemaPattern).append("'"); | ||
| } | ||
| if (procedureNamePattern != null) { | ||
| sql.append(" AND routine_name LIKE '").append(procedureNamePattern).append("'"); |
There was a problem hiding this comment.
Is it possible to use parameterized statements here and provide user provided values as parameters? This will prevent SQL injection issues
There was a problem hiding this comment.
Done. Now using ? placeholders with ImmutableSqlParameter for server-side binding.
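A minimal sketch of the placeholder approach described in this reply. The driver's ImmutableSqlParameter is stood in for by a plain Object map value so the example is self-contained, and `buildProceduresSql` is a hypothetical name, not the driver's actual method:

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of the ?-placeholder pattern from the review reply. User input
// never enters the SQL text; it is collected into a parameter map and
// bound server-side (the real driver uses ImmutableSqlParameter values).
public class ProceduresSqlSketch {
  static String buildProceduresSql(
      String schemaPattern, String procedureNamePattern, Map<Integer, Object> params) {
    StringBuilder sql =
        new StringBuilder(
            "SELECT * FROM system.information_schema.routines"
                + " WHERE routine_type = 'PROCEDURE'");
    int paramIndex = 1;
    if (schemaPattern != null) {
      sql.append(" AND routine_schema LIKE ?");
      params.put(paramIndex++, schemaPattern);
    }
    if (procedureNamePattern != null) {
      sql.append(" AND routine_name LIKE ?");
      params.put(paramIndex++, procedureNamePattern);
    }
    return sql.toString();
  }

  public static void main(String[] args) {
    Map<Integer, Object> params = new LinkedHashMap<>();
    System.out.println(buildProceduresSql("main%", "my_proc", params));
    System.out.println(params);
  }
}
```

The caller passes the populated map to executeStatement, so the binding path is the same one PreparedStatement uses.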
```java
private static String getCatalogPrefix(String catalog) {
  return (catalog == null) ? "system" : "`" + catalog + "`";
```

**Reviewer:** What if catalogName already contains a backtick character? Should we escape that?

**Author:** Catalog names cannot contain backticks, so the user input would be invalid in that case. No escaping needed.
```java
return (short) procedureColumnUnknown;
}
switch (parameterMode.toUpperCase()) {
  case "IN":
```

**Author:** Done. Extracted PARAM_MODE_IN, PARAM_MODE_OUT, PARAM_MODE_INOUT, and IS_RESULT_YES as constants.
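A sketch of the resulting mode mapping, using the constant names from this reply; the real method in the driver may differ in signature. The target values are the standard java.sql.DatabaseMetaData COLUMN_TYPE constants:

```java
import java.sql.DatabaseMetaData;

// Sketch of mapParameterModeToColumnType after extracting the mode strings
// as constants. IN -> 1, INOUT -> 2, OUT -> 4, anything else -> unknown (0).
public class ParameterModeSketch {
  static final String PARAM_MODE_IN = "IN";
  static final String PARAM_MODE_OUT = "OUT";
  static final String PARAM_MODE_INOUT = "INOUT";

  static short mapParameterModeToColumnType(String parameterMode) {
    if (parameterMode == null) {
      return (short) DatabaseMetaData.procedureColumnUnknown; // 0
    }
    switch (parameterMode.toUpperCase()) {
      case PARAM_MODE_IN:
        return (short) DatabaseMetaData.procedureColumnIn; // 1
      case PARAM_MODE_OUT:
        return (short) DatabaseMetaData.procedureColumnOut; // 4
      case PARAM_MODE_INOUT:
        return (short) DatabaseMetaData.procedureColumnInOut; // 2
      default:
        return (short) DatabaseMetaData.procedureColumnUnknown;
    }
  }

  public static void main(String[] args) {
    System.out.println(mapParameterModeToColumnType("out")); // prints 4
  }
}
```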
```java
try {
  Object val = resultSet.getObject(columnName);
  return val != null ? val.toString() : null;
} catch (SQLException e) {
```

**Author:** Done. Added debug logging to getRowsForProcedures, getRowsForProcedureColumns, and mapParameterModeToColumnType.
```java
if (val == null) return null;
if (val instanceof Number) return ((Number) val).intValue();
return Integer.parseInt(val.toString());
} catch (SQLException | NumberFormatException e) {
```

**Author:** Done — see above.
```java
if (val == null) return null;
if (val instanceof Number) return ((Number) val).shortValue();
return Short.parseShort(val.toString());
} catch (SQLException | NumberFormatException e) {
```

**Author:** Done — see above.
```java
Integer charMaxLength = getIntOrNull(resultSet, "character_maximum_length");
Integer charOctetLength = getIntOrNull(resultSet, "character_octet_length");
row.add(numericPrecision != null ? numericPrecision : charMaxLength); // PRECISION
row.add(charOctetLength != null ? charOctetLength : numericPrecision); // LENGTH
```

**Reviewer:** But numericPrecision can be null also.

**Author:** Correct — if both numericPrecision and charMaxLength are null, COLUMN_SIZE returns null, which is the expected behavior per the JDBC spec for types where precision is not applicable.
```java
Integer numericPrecision = getIntOrNull(resultSet, "numeric_precision");
Integer charMaxLength = getIntOrNull(resultSet, "character_maximum_length");
Integer charOctetLength = getIntOrNull(resultSet, "character_octet_length");
row.add(numericPrecision != null ? numericPrecision : charMaxLength); // PRECISION
```

**Reviewer:** charMaxLength itself can be null.

**Author:** Same as above — the null fallback is intentional. Returns null when neither numeric precision nor character length applies to the type.
```java
Integer charMaxLength = getIntOrNull(resultSet, "character_maximum_length");
Integer charOctetLength = getIntOrNull(resultSet, "character_octet_length");
row.add(numericPrecision != null ? numericPrecision : charMaxLength); // PRECISION
row.add(charOctetLength != null ? charOctetLength : numericPrecision); // LENGTH
```

**Reviewer:** Why is length falling back to precision?

**Author:** Fixed. BUFFER_LENGTH now uses charOctetLength directly (null for non-character types) instead of falling back to numericPrecision.
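A small sketch of the precision/length logic after this fix, with the result-set reads replaced by plain Integer parameters so it runs standalone; the helper names are illustrative, not the driver's:

```java
// Sketch of the PRECISION / LENGTH column logic after the fix discussed
// above. All three inputs may be null; a null result is acceptable per the
// JDBC spec for types where the value does not apply.
public class PrecisionLengthSketch {
  static Integer precision(Integer numericPrecision, Integer charMaxLength) {
    // Numeric types carry numeric_precision, character types carry
    // character_maximum_length, and other types (e.g. BOOLEAN) carry neither.
    return (numericPrecision != null) ? numericPrecision : charMaxLength;
  }

  static Integer length(Integer charOctetLength) {
    // After the fix: length comes from character_octet_length only, with no
    // fallback to numericPrecision (so it is null for non-character types).
    return charOctetLength;
  }

  public static void main(String[] args) {
    System.out.println(precision(10, null));   // numeric type -> 10
    System.out.println(precision(null, 255));  // character type -> 255
    System.out.println(precision(null, null)); // neither applies -> null
    System.out.println(length(null));          // non-character type -> null
  }
}
```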
```java
Arguments.of(
    "SELECT p.specific_catalog, p.specific_schema, p.specific_name,"
        + " p.parameter_name, p.parameter_mode, p.is_result,"
        + " p.data_type, p.full_data_type,"
```

**Reviewer:** Where is full_data_type used?

**Author:** Removed. It was unused — numeric_precision and numeric_scale are available as separate columns, so full_data_type is not needed.
…SQL queries

Implement JDBC-compliant getProcedures and getProcedureColumns by querying information_schema.ROUTINES and information_schema.parameters via SQL. This approach works for both Thrift and SEA transports without using Thrift RPC.

- getProcedures: queries ROUTINES filtered by routine_type='PROCEDURE'
- getProcedureColumns: queries parameters joined with ROUTINES for procedures
- Null catalog uses system.information_schema; a specific catalog uses <catalog>.information_schema
- Shared SQL builders in CommandConstants to avoid duplication across SDK/Thrift clients
- Maps parameter_mode (IN/OUT/INOUT) to JDBC COLUMN_TYPE constants
- Maps Databricks type names to java.sql.Types via the existing getCode()

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…line helpers

- Extract table names, column lists, and filter into named constants in CommandConstants
- Remove getProcedureColumnPrecision/getProcedureColumnLength helpers with unused dataType param
- Inline precision/length logic directly in getRowsForProcedureColumns

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ocedureColumns

Unit tests (DatabricksMetadataSdkClientTest):
- testListProcedures: parameterized tests for SQL generation with various catalog/schema/name combinations, including null catalog (system prefix)
- testListProcedureColumns: parameterized tests for SQL generation with JOIN to routines table, all filter combinations

Integration test (MetadataIntegrationTests):
- testGetProceduresAndProcedureColumns: creates a test procedure with IN and OUT params, verifies getProcedures returns correct metadata (name, remarks, type), verifies getProcedureColumns returns correct parameter info (column types IN=1/OUT=4, data types, ordinal positions), tests column name filtering, cleans up the procedure after the test

Also renames ROUTINES -> routines in CommandConstants for consistency.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…I changes

Add GET_PROCEDURES and GET_PROCEDURE_COLUMNS to the MetadataOperationType enum. Update getResultSet/executeStatement calls to pass the new required MetadataOperationType parameter after the upstream API signature change.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
String.format treats the single quote before %s as a flag character, causing UnknownFormatConversionException when patterns like '%' are passed. Replace String.format with StringBuilder.append for SQL LIKE clauses.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
… '%'

The LOGGER.error(e, "message {}", e) call uses String.format internally.
When the exception message contains SQL with LIKE '%', the '%' is
interpreted as a format specifier, causing UnknownFormatConversionException.
Use a plain log message without exception interpolation.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
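The failure mode this commit describes can be reproduced in a few lines; the SQL text below is illustrative only:

```java
import java.util.UnknownFormatConversionException;

// Reproduces the pitfall from the commit message: once a message containing
// LIKE '%' is run through String.format (as the logger did internally), the
// "%'" sequence is parsed as an invalid format conversion and throws.
public class FormatPitfall {
  public static void main(String[] args) {
    String msg = "Query failed: ... WHERE routine_name LIKE '%'";
    try {
      String.format(msg); // '%' followed by ''' is not a valid conversion
      System.out.println("no exception");
    } catch (UnknownFormatConversionException e) {
      System.out.println("caught: " + e.getMessage());
    }
  }
}
```

This is why the fix logs the message without interpolating the exception text through a format call.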
…m values

Add GET_PROCEDURES and GET_PROCEDURE_COLUMNS to the enum count assertion and parameterized header value tests.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Replace string concatenation with ? placeholders and ImmutableSqlParameter for all LIKE clause values in buildProceduresSQL and buildProcedureColumnsSQL. This prevents SQL injection via user-provided schema/procedure/column patterns.

The build methods now accept a Map<Integer, ImmutableSqlParameter> that they populate with the parameter bindings. Callers pass this map through to executeStatement instead of an empty HashMap.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…o prevent injection

Use ? placeholders with ImmutableSqlParameter for LIKE clause values in buildProceduresSQL and buildProcedureColumnsSQL. Parameters are passed to the server via executeStatement's parameter map for server-side binding, the same mechanism used by PreparedStatement.

Also resolves a merge conflict with the upstream rename of DatabricksMetadataSdkClient to DatabricksMetadataQueryClient.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Extract string constants for parameter mode values (IN/OUT/INOUT/YES)
- Add debug logging to getRowsForProcedures, getRowsForProcedureColumns, and mapParameterModeToColumnType
- Fix BUFFER_LENGTH: use charOctetLength directly instead of falling back to numericPrecision, which is semantically wrong
- Remove unused full_data_type from PARAMETERS_SELECT_COLUMNS

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ries

The SQL now uses ? placeholders with server-side parameter binding, so the WireMock stubs need to match the new request format.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Force-pushed fbbf4db to 02b19f1
… and fix column count

The PARAMETERS_SELECT_COLUMNS no longer includes full_data_type (removed per PR review), so the unit test expected SQL strings and the column count mock need to match.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
```java
int paramIndex = 1;

StringBuilder sql = new StringBuilder();
sql.append("SELECT ").append(PARAMETERS_SELECT_COLUMNS);
```

**Reviewer:** Can we use a utility like mybatis-3 for creating such SQL strings?
```java
List<List<Object>> rows = new ArrayList<>();
while (resultSet.next()) {
  List<Object> row = new ArrayList<>();
  row.add(getStringOrNull(resultSet, "routine_catalog")); // PROCEDURE_CAT
```

**Reviewer:** Declare all of these hardcoded values as constants.

**Author:** Done. Extracted all column name strings to private static final constants (COL_ROUTINE_CATALOG, COL_SPECIFIC_NAME, COL_DATA_TYPE, etc.).
```java
LOGGER.debug("Building rows for getProcedureColumns result set");
List<List<Object>> rows = new ArrayList<>();
while (resultSet.next()) {
  String dataType = getStringOrNull(resultSet, "data_type");
```

**Author:** Done — see above.
```java
String isResult = getStringOrNull(resultSet, "is_result");

List<Object> row = new ArrayList<>();
row.add(getStringOrNull(resultSet, "specific_catalog")); // PROCEDURE_CAT
```

**Reviewer:** Is a null value allowed for catalog and schema?

**Author:** Yes, per the JDBC spec both PROCEDURE_CAT and PROCEDURE_SCHEM are nullable. From the spec: "PROCEDURE_CAT String => procedure catalog (may be null)" and "PROCEDURE_SCHEM String => procedure schema (may be null)".
```java
row.add(getStringOrNull(resultSet, "parameter_default")); // COLUMN_DEF
row.add(null); // SQL_DATA_TYPE (reserved)
row.add(null); // SQL_DATETIME_SUB (reserved)
row.add(getIntOrNull(resultSet, "character_octet_length")); // CHAR_OCTET_LENGTH
```

**Reviewer:** Is this the same as buffer_length? And you are reading it again.

**Author:** Good catch. Fixed — now reusing the charOctetLength variable for both BUFFER_LENGTH (col 9) and CHAR_OCTET_LENGTH (col 17) instead of calling getIntOrNull twice.
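A simplified, self-contained sketch of the row layout after this fix. The driver reads these values from the result set; here they are plain parameters, and RADIX, ORDINAL_POSITION, and IS_NULLABLE are placeholder values. The column order follows DatabaseMetaData.getProcedureColumns:

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of the reuse fix: character_octet_length is read once and the same
// variable feeds both the length column (index 8, 0-based) and
// CHAR_OCTET_LENGTH (index 16), with no second result-set read.
public class ProcedureColumnRowSketch {
  static List<Object> buildRow(String cat, String schem, String proc, String col,
      short columnType, int dataType, String typeName,
      Integer precision, Integer charOctetLength, Integer scale) {
    List<Object> row = new ArrayList<>();
    row.add(cat);             // PROCEDURE_CAT (may be null)
    row.add(schem);           // PROCEDURE_SCHEM (may be null)
    row.add(proc);            // PROCEDURE_NAME
    row.add(col);             // COLUMN_NAME
    row.add(columnType);      // COLUMN_TYPE
    row.add(dataType);        // DATA_TYPE
    row.add(typeName);        // TYPE_NAME
    row.add(precision);       // PRECISION
    row.add(charOctetLength); // LENGTH (read once, reused below)
    row.add(scale);           // SCALE
    row.add(10);              // RADIX (placeholder)
    row.add(2);               // NULLABLE = procedureNullableUnknown
    row.add(null);            // REMARKS
    row.add(null);            // COLUMN_DEF
    row.add(null);            // SQL_DATA_TYPE (reserved)
    row.add(null);            // SQL_DATETIME_SUB (reserved)
    row.add(charOctetLength); // CHAR_OCTET_LENGTH (same variable, no re-read)
    row.add(1);               // ORDINAL_POSITION (placeholder)
    row.add("");              // IS_NULLABLE (unknown)
    row.add(proc);            // SPECIFIC_NAME (simplified)
    return row;
  }

  public static void main(String[] args) {
    List<Object> row = buildRow("main", "default", "p", "x",
        (short) 1, java.sql.Types.DOUBLE, "DOUBLE", 15, null, null);
    System.out.println(row.size()); // 20 columns per the JDBC spec
  }
}
```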
…e charOctetLength

- Extract all hardcoded column name strings to private static final constants (COL_ROUTINE_CATALOG, COL_SPECIFIC_NAME, COL_DATA_TYPE, etc.)
- Reuse the charOctetLength variable for CHAR_OCTET_LENGTH instead of calling getIntOrNull a second time
- Add nullable annotations in comments per the JDBC spec

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…eColumns

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
## Summary
- Implement `getProcedures`, `getProcedureColumns` metadata operations
- Follow the existing `getTables`/`getColumns` pattern with trait methods, SQL builders, SeaClient impl, FFI functions, and C header declarations
- Add comprehensive E2E metadata tests covering all metadata operations

### getProcedures
- Queries `information_schema.routines` filtered by `routine_type = 'PROCEDURE'`
- Catalog resolution: `NULL` → `system` (cross-catalog), specific value → that catalog, empty string → empty result
- Pattern filtering via SQL `LIKE` on `routine_schema` and `routine_name`

### getProcedureColumns
- Queries `information_schema.parameters` joined with `information_schema.routines`
- Same catalog resolution and pattern filtering as getProcedures
- Selects all columns needed for ODBC `SQLProcedureColumns`: parameter_name, parameter_mode, is_result, data_type, full_data_type, numeric_precision/scale, character_maximum/octet_length, ordinal_position, parameter_default, comment

### getCrossReferences
- Uses `SHOW FOREIGN KEYS` (same query as `getForeignKeys`)
- Takes all 6 ODBC `SQLCrossReferences` parameters (pk_catalog/schema/table + fk_catalog/schema/table)
- Parent table filtering is done client-side by the ODBC C++ layer

### Design references
- JDBC implementation: databricks/databricks-jdbc#1238
- Design doc: [SQLProcedures and SQLProcedureColumns Support](https://docs.google.com/document/d/1WV1hOiJA8Obs9q3o47u7cvZCwnRvKg8kWI8gZhuFFaY)

## Test plan
- [x] 12 unit tests for SQL builders (catalog resolution, pattern filters, escaping, column selection)
- [x] 12 E2E metadata tests (`#[ignore]`, require real Databricks connection) covering all operations: catalogs, schemas, tables, columns, primary keys, foreign keys, cross-references, procedures, procedure columns
- [x] `cargo test` — 237 tests pass
- [x] `cargo clippy --all-targets -- -D warnings` — clean
- [x] `cargo +stable fmt --all` — clean
- [ ] E2E validation against live workspace (blocked on token — tests compile and are properly `#[ignore]`d)

This pull request was AI-assisted by Isaac.
Description

Implements getProcedures and getProcedureColumns in DatabricksDatabaseMetaData by querying information_schema.routines and information_schema.parameters via SQL.

Unlike other metadata operations that use SHOW commands or Thrift RPCs, these use direct SQL SELECT queries against information_schema views. This works for both Thrift and SEA transports.

getProcedures:
- Queries information_schema.routines filtered by routine_type = 'PROCEDURE'
- PROCEDURE_TYPE reported as procedureNoResult (1)

getProcedureColumns:
- Queries information_schema.parameters JOINed with information_schema.routines to filter for procedures
- Maps parameter_mode (IN/OUT/INOUT) to JDBC COLUMN_TYPE constants (1/4/2)
- Maps type names to java.sql.Types codes via the existing getCode()

Catalog resolution:
- Null catalog: system.information_schema.routines (cross-catalog)
- Specific catalog: <catalog>.information_schema.routines

Shared SQL builders in CommandConstants eliminate duplication between SDK and Thrift clients.

Testing

Unit tests (DatabricksMetadataSdkClientTest):
- listProcedures SQL generation (catalog+schema+name, null schema, null name, null catalog)
- listProcedureColumns SQL generation (all filters, partial filters, all nulls)

Integration test (MetadataIntegrationTests#testGetProceduresAndProcedureColumns):
- Creates jdbc_test_compute_area(x DOUBLE, y DOUBLE, OUT area DOUBLE) with a COMMENT
- Verifies getProcedures returns the correct name, schema, catalog, remarks, and type
- Verifies getProcedureColumns returns 3 params with correct COLUMN_TYPE (IN=1, OUT=4), DATA_TYPE (DOUBLE=8), and ordinal positions

Existing tests: all 248 DatabricksDatabaseMetaDataTest and 51 DatabricksMetadataSdkClientTest tests pass.

Additional Notes to the Reviewer
- information_schema.parameters is used (not routine_columns) because routine_columns contains table-valued function output columns, not procedure parameters.
- NULLABLE is reported as procedureNullableUnknown (2) since the server does not track parameter nullability.
- For a null catalog, system.information_schema is queried, which requires system table access permissions. If the user lacks this permission, the driver returns an empty result set. This will be addressed server-side in a future release.