Commit c4eeec1

Implement Java scalar UDF and table function support
1 parent fd4d7ed commit c4eeec1

45 files changed

Lines changed: 11274 additions & 562 deletions

CMakeLists.txt

Lines changed: 3 additions & 0 deletions
@@ -591,13 +591,16 @@ add_library(duckdb_java SHARED
 src/jni/bindings_common.cpp
 src/jni/bindings_data_chunk.cpp
 src/jni/bindings_logical_type.cpp
+src/jni/bindings_scalar_function.cpp
+src/jni/bindings_table_function.cpp
 src/jni/bindings_validity.cpp
 src/jni/bindings_vector.cpp
 src/jni/config.cpp
 src/jni/duckdb_java.cpp
 src/jni/functions.cpp
 src/jni/refs.cpp
 src/jni/types.cpp
+src/jni/udf_registration.cpp
 src/jni/util.cpp
 ${DUCKDB_SRC_FILES})

README.md

Lines changed: 4 additions & 0 deletions
@@ -20,3 +20,7 @@ This optionally takes an argument to only run a single test, for example:
 ```
 java -cp "build/release/duckdb_jdbc_tests.jar:build/release/duckdb_jdbc.jar" org/duckdb/TestDuckDBJDBC test_valid_but_local_config_throws_exception
 ```
+
+### User-Defined Functions (Java)
+
+All Java UDF documentation and examples are available in [UDF.MD](UDF.MD).

UDF.MD

Lines changed: 212 additions & 0 deletions
@@ -0,0 +1,212 @@
+# User-Defined Functions (Java)
+
+This guide shows how to use Java Scalar UDFs and Table Functions with `DuckDBConnection`.
+
+## Scalar UDF
+
+Scalar UDF callbacks use a vectorized contract:
+
+```java
+ScalarUdf.apply(UdfContext ctx, UdfReader[] args, UdfScalarWriter out, int rowCount)
+```
+
+Use `rowCount` loops and write one output value per row.
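
The vectorized contract above can be tried out without the DuckDB driver. The following is a minimal sketch under stated assumptions: `VectorizedContractSketch` and `IntChunkFunction` are hypothetical stand-ins invented for illustration, with plain `int[]` buffers playing the role of `UdfReader` and `UdfScalarWriter`. It only shows the shape of the loop, one callback invocation per chunk and one output value per row.

```java
import java.util.Arrays;

/** Stand-in sketch of a vectorized scalar callback; none of these types are DuckDB classes. */
final class VectorizedContractSketch {

    /** Hypothetical callback shape: invoked once per chunk, not once per row. */
    interface IntChunkFunction {
        void apply(int[] input, int[] output, int rowCount);
    }

    /** Runs the callback once over the whole chunk and returns the filled output buffer. */
    static int[] evaluate(IntChunkFunction fn, int[] input) {
        int[] output = new int[input.length];
        fn.apply(input, output, input.length);
        return output;
    }

    public static void main(String[] args) {
        IntChunkFunction addOne = (in, out, rowCount) -> {
            for (int row = 0; row < rowCount; row++) {
                out[row] = in[row] + 1; // exactly one output value per input row
            }
        };
        System.out.println(Arrays.toString(evaluate(addOne, new int[] {41, 0, -1}))); // [42, 1, 0]
    }
}
```

Amortizing the call overhead over a whole chunk is the usual motivation for this shape; the chunk sizes the real binding passes are not stated in this guide.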
+
+### Basic example
+
+```java
+try (DuckDBConnection conn = DriverManager.getConnection("jdbc:duckdb:").unwrap(DuckDBConnection.class);
+     Statement stmt = conn.createStatement()) {
+
+    conn.registerScalarUdf("add_one", DuckDBColumnType.INTEGER, DuckDBColumnType.INTEGER,
+        (ctx, args, out, rowCount) -> {
+            for (int row = 0; row < rowCount; row++) {
+                out.setInt(row, args[0].getInt(row) + 1);
+            }
+        });
+
+    try (ResultSet rs = stmt.executeQuery("SELECT add_one(41)")) {
+        rs.next();
+        System.out.println(rs.getInt(1)); // 42
+    }
+}
+```
+
+### Registration forms
+
+You can register scalar UDFs with:
+
+- `DuckDBColumnType` signatures (`registerScalarUdf`)
+- `Class<?>` signatures (`registerScalarUdf`)
+- explicit `UdfLogicalType` signatures (`registerScalarUdf`)
+- varargs signatures (`registerScalarUdfVarArgs`)
+
+For decimal precision/scale, prefer explicit logical types:
+
+```java
+conn.registerScalarUdf(
+    "mul_decimal",
+    new UdfLogicalType[] {UdfLogicalType.decimal(20, 4), UdfLogicalType.decimal(20, 4)},
+    UdfLogicalType.decimal(38, 8),
+    (ctx, args, out, rowCount) -> {
+        for (int row = 0; row < rowCount; row++) {
+            out.setBigDecimal(row, args[0].getBigDecimal(row).multiply(args[1].getBigDecimal(row)));
+        }
+    }
+);
+```
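
The scale arithmetic behind the `mul_decimal` signature can be checked with plain `java.math.BigDecimal`, independent of the driver. In the sketch below the class name `DecimalScaleSketch` is made up for illustration. It shows that multiplying two scale-4 values yields a scale-8 result, which is why a return type of `decimal(38, 8)` lines up with two `decimal(20, 4)` inputs.

```java
import java.math.BigDecimal;

/** Illustrative check of BigDecimal scale behaviour; not part of the DuckDB API. */
final class DecimalScaleSketch {

    /** Multiplies two decimal literals exactly; the result scale is the sum of the input scales. */
    static BigDecimal multiply(String left, String right) {
        return new BigDecimal(left).multiply(new BigDecimal(right));
    }

    public static void main(String[] args) {
        BigDecimal product = multiply("1.2345", "2.0000"); // both operands have scale 4
        System.out.println(product);         // 2.46900000
        System.out.println(product.scale()); // 8
    }
}
```

The precision of a product can reach the sum of the operand precisions, here up to 40 digits, so very large inputs may not fit into 38 digits. How the binding reports that case is not covered by this guide.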
+
+### Options
+
+`UdfOptions` controls scalar behavior:
+
+- `deterministic(true|false)`: marks whether equal inputs always produce equal output. Use `false` for non-deterministic logic (for example random/time-based behavior).
+- `nullSpecialHandling(true|false)`: when `true`, your callback receives rows that contain `NULL` input values; when `false`, DuckDB handles null propagation before callback execution.
+- `returnNullOnException(true|false)`: when `true`, Java exceptions in callback rows are returned as `NULL`; when `false`, the query fails with an error.
+- `varArgs(true|false)`: enables varargs registration (normally used via `registerScalarUdfVarArgs`).
+
+Example:
+
+```java
+UdfOptions options = new UdfOptions()
+    .deterministic(true)
+    .nullSpecialHandling(true)
+    .returnNullOnException(false);
+
+conn.registerScalarUdf("safe_add", DuckDBColumnType.INTEGER, DuckDBColumnType.INTEGER,
+    (ctx, args, out, rowCount) -> {
+        for (int row = 0; row < rowCount; row++) {
+            if (args[0].isNull(row)) {
+                out.setNull(row);
+            } else {
+                out.setInt(row, args[0].getInt(row) + 1);
+            }
+        }
+    }, options);
+```
+
+## UdfReader / UdfScalarWriter object mappings
+
+| DuckDB type | Reader object | Writer object |
+| --- | --- | --- |
+| `BOOLEAN` | `Boolean` | `Boolean` |
+| `TINYINT`, `SMALLINT`, `INTEGER`, `UTINYINT`, `USMALLINT` | `Integer` | `Integer` |
+| `BIGINT`, `UINTEGER`, `UBIGINT` | `Long` | `Long` |
+| `FLOAT` | `Float` | `Float` |
+| `DOUBLE` | `Double` | `Double` |
+| `DECIMAL` | `BigDecimal` | `BigDecimal` |
+| `VARCHAR` | `String` | `String` |
+| `BLOB` | `byte[]` | `byte[]` |
+| `DATE` | `LocalDate` or `Date` | `LocalDate` or `Date` |
+| `TIME`, `TIME_NS` | `LocalTime` | `LocalTime` |
+| `TIME_WITH_TIME_ZONE` | `OffsetTime` | `OffsetTime` |
+| `TIMESTAMP`, `TIMESTAMP_S`, `TIMESTAMP_MS`, `TIMESTAMP_NS` | `LocalDateTime` | `LocalDateTime` or `Date` |
+| `TIMESTAMP_WITH_TIME_ZONE` | `OffsetDateTime` | `OffsetDateTime` or `Date` |
+| `UUID` | `UUID` | `UUID` |
+| `HUGEINT`, `UHUGEINT` | `byte[]` | `byte[]` |
+
+`UdfScalarWriter` supports explicit setters and `setObject(...)`.
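
Because the table maps `UBIGINT` to a signed Java `Long`, values above `Long.MAX_VALUE` cannot be represented as positive numbers. The sketch below assumes, and this is only an assumption about the binding, that such values arrive with the same 64-bit pattern and therefore look negative. Under that assumption the standard `Long` unsigned helpers recover the intended value. The class name `UnsignedLongSketch` is hypothetical.

```java
/** Illustrative handling of unsigned 64-bit values carried in a signed long. */
final class UnsignedLongSketch {

    /** Renders the 64-bit pattern as an unsigned decimal string. */
    static String render(long bits) {
        return Long.toUnsignedString(bits);
    }

    /** Compares two 64-bit patterns as unsigned numbers. */
    static boolean lessThan(long left, long right) {
        return Long.compareUnsigned(left, right) < 0;
    }

    public static void main(String[] args) {
        long max = -1L; // same bit pattern as 2^64 - 1
        System.out.println(render(max));       // 18446744073709551615
        System.out.println(lessThan(1L, max)); // true
    }
}
```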
+
+## Table Function
+
+Table function callbacks use:
+
+- `bind(BindContext ctx, Object[] parameters) -> TableBindResult`
+- `init(InitContext ctx, TableBindResult bind) -> TableState`
+- `produce(TableState state, UdfOutputAppender out) -> int`
+
+What each callback does:
+
+- `bind`: runs once per invocation to validate/interpret parameters, define output schema, and create bind state.
+- `init`: runs after bind to initialize execution state (cursor/counters/chunk state).
+- `produce`: runs repeatedly to emit rows in chunks; return the number of rows produced in that call.
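
The callback order described above can be sketched as a small driver loop. Everything here is a stand-in written for illustration: `TableLifecycleSketch`, `ChunkedSource`, and `drive` are hypothetical names, `long[]` replaces `TableBindResult` and `TableState`, and a `List<Long>` replaces `UdfOutputAppender`. The sketch also assumes that a `produce` call returning 0 signals the end of the data, which the text above does not state explicitly.

```java
import java.util.ArrayList;
import java.util.List;

/** Stand-in lifecycle driver; these types are illustrative, not the DuckDB classes. */
final class TableLifecycleSketch {

    /** Hypothetical callback trio mirroring bind, init and produce. */
    interface ChunkedSource {
        long[] bind(Object[] parameters);
        long[] init(long[] bindState);
        int produce(long[] state, List<Long> out);
    }

    /** Assumed engine loop: bind once, init once, then pull chunks until one is empty. */
    static List<Long> drive(ChunkedSource fn, Object... parameters) {
        long[] bindState = fn.bind(parameters);
        long[] state = fn.init(bindState);
        List<Long> rows = new ArrayList<>();
        while (fn.produce(state, rows) > 0) {
            // keep pulling; a return value of 0 is treated as end of data (assumption)
        }
        return rows;
    }

    /** A range source that emits at most chunkSize rows per produce call. */
    static ChunkedSource range(int chunkSize) {
        return new ChunkedSource() {
            @Override
            public long[] bind(Object[] parameters) {
                long end = ((Number) parameters[0]).longValue();
                return new long[] {0L, end}; // {cursor, exclusive end}
            }

            @Override
            public long[] init(long[] bindState) {
                return bindState.clone(); // per-execution copy, so the bind state stays untouched
            }

            @Override
            public int produce(long[] state, List<Long> out) {
                int produced = 0;
                while (produced < chunkSize && state[0] < state[1]) {
                    out.add(state[0]++);
                    produced++;
                }
                return produced;
            }
        };
    }

    public static void main(String[] args) {
        System.out.println(drive(range(2), 5L)); // [0, 1, 2, 3, 4]
    }
}
```

Keeping the cursor in the state returned by `init` rather than in the bind state means each execution starts from a fresh position, which is one plausible reason for separating the two callbacks.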
+
+### Basic example
+
+```java
+conn.registerTableFunction(
+    "range_java",
+    new TableFunction() {
+        @Override
+        public TableBindResult bind(BindContext ctx, Object[] parameters) {
+            long end = ((Number) parameters[0]).longValue();
+            return new TableBindResult(
+                new String[] {"i"},
+                new UdfLogicalType[] {UdfLogicalType.of(DuckDBColumnType.BIGINT)},
+                new long[] {0L, end}
+            );
+        }
+
+        @Override
+        public TableState init(InitContext ctx, TableBindResult bind) {
+            return new TableState(bind.getBindState());
+        }
+
+        @Override
+        public int produce(TableState state, UdfOutputAppender out) {
+            long[] st = (long[]) state.getState();
+            long current = st[0];
+            long end = st[1];
+            int produced = 0;
+
+            while (produced < 256 && current < end) {
+                out.beginRow().append(current).endRow();
+                current++;
+                produced++;
+            }
+
+            st[0] = current;
+            return produced;
+        }
+    },
+    new TableFunctionDefinition().withParameterTypes(new DuckDBColumnType[] {DuckDBColumnType.BIGINT}),
+    new TableFunctionOptions().threadSafe(false).maxThreads(1)
+);
+```
+
+### Bind parameter object mappings
+
+In `bind`, parameters are materialized as Java objects. Common mappings:
+
+- `DECIMAL -> BigDecimal`
+- `DATE -> LocalDate`
+- `TIME`, `TIME_NS -> LocalTime`
+- `TIMESTAMP* -> LocalDateTime`
+- `TIME_WITH_TIME_ZONE -> OffsetTime`
+- `TIMESTAMP_WITH_TIME_ZONE -> OffsetDateTime`
+- `UUID -> UUID`
+
+### Output writing with UdfOutputAppender
+
+`UdfOutputAppender` supports:
+
+- primitive/object `append(...)` for one column at a time
+- `setObject(...)` and typed setters (`setBigDecimal`, `setLocalDate`, etc.)
+- nested output objects for container types:
+  - `LIST`/`ARRAY`: Java arrays or `Collection`
+  - `MAP`: `Map`
+  - `STRUCT`: positional `List`/array or named `Map<String, Object>`
+  - `UNION`: `AbstractMap.SimpleEntry<String, Object>`
+  - `ENUM`: `String`
+
+## Table function options
+
+`TableFunctionOptions`:
+
+- `threadSafe(false|true)`
+- `maxThreads(int >= 1)`
+
+`TableFunctionDefinition`:
+
+- `withParameterTypes(...)`
+- `withProjectionPushdown(true|false)`
+
+## Unsupported in scalar signatures
+
+Scalar UDF signatures do not support nested/container logical types (`LIST`, `STRUCT`, `MAP`, `ARRAY`, `UNION`, `ENUM`) and `INTERVAL`.
+
+## Practical recommendations
+
+- Use chunk-oriented loops (`rowCount`) for scalar UDF throughput.
+- Avoid executing SQL on the same `DuckDBConnection` from inside callbacks.
+- Use explicit logical types for decimal-sensitive workloads.

duckdb_java.def

Lines changed: 38 additions & 1 deletion
@@ -50,7 +50,6 @@ Java_org_duckdb_DuckDBNative_duckdb_1jdbc_1set_1auto_1commit
 Java_org_duckdb_DuckDBNative_duckdb_1jdbc_1set_1catalog
 Java_org_duckdb_DuckDBNative_duckdb_1jdbc_1set_1schema
 Java_org_duckdb_DuckDBNative_duckdb_1jdbc_1startup
-
 Java_org_duckdb_DuckDBBindings_duckdb_1vector_1size
 Java_org_duckdb_DuckDBBindings_duckdb_1create_1logical_1type
 Java_org_duckdb_DuckDBBindings_duckdb_1get_1type_1id
@@ -82,6 +81,12 @@ Java_org_duckdb_DuckDBBindings_duckdb_1list_1vector_1set_1size
 Java_org_duckdb_DuckDBBindings_duckdb_1list_1vector_1reserve
 Java_org_duckdb_DuckDBBindings_duckdb_1struct_1vector_1get_1child
 Java_org_duckdb_DuckDBBindings_duckdb_1array_1vector_1get_1child
+Java_org_duckdb_DuckDBBindings_duckdb_1udf_1get_1varchar_1bytes
+Java_org_duckdb_DuckDBBindings_duckdb_1udf_1set_1varchar_1bytes
+Java_org_duckdb_DuckDBBindings_duckdb_1udf_1get_1blob_1bytes
+Java_org_duckdb_DuckDBBindings_duckdb_1udf_1set_1blob_1bytes
+Java_org_duckdb_DuckDBBindings_duckdb_1udf_1get_1decimal
+Java_org_duckdb_DuckDBBindings_duckdb_1udf_1set_1decimal
 Java_org_duckdb_DuckDBBindings_duckdb_1create_1data_1chunk
 Java_org_duckdb_DuckDBBindings_duckdb_1destroy_1data_1chunk
 Java_org_duckdb_DuckDBBindings_duckdb_1data_1chunk_1reset
@@ -98,6 +103,38 @@ Java_org_duckdb_DuckDBBindings_duckdb_1appender_1column_1count
 Java_org_duckdb_DuckDBBindings_duckdb_1appender_1column_1type
 Java_org_duckdb_DuckDBBindings_duckdb_1append_1data_1chunk
 Java_org_duckdb_DuckDBBindings_duckdb_1append_1default_1to_1chunk
+Java_org_duckdb_DuckDBBindings_duckdb_1create_1scalar_1function
+Java_org_duckdb_DuckDBBindings_duckdb_1destroy_1scalar_1function
+Java_org_duckdb_DuckDBBindings_duckdb_1scalar_1function_1set_1name
+Java_org_duckdb_DuckDBBindings_duckdb_1scalar_1function_1add_1parameter
+Java_org_duckdb_DuckDBBindings_duckdb_1scalar_1function_1set_1return_1type
+Java_org_duckdb_DuckDBBindings_duckdb_1scalar_1function_1set_1volatile
+Java_org_duckdb_DuckDBBindings_duckdb_1scalar_1function_1set_1special_1handling
+Java_org_duckdb_DuckDBBindings_duckdb_1register_1scalar_1function
+Java_org_duckdb_DuckDBBindings_duckdb_1register_1scalar_1function_1java
+Java_org_duckdb_DuckDBBindings_duckdb_1register_1scalar_1function_1java_1with_1function
+Java_org_duckdb_DuckDBBindings_duckdb_1create_1table_1function
+Java_org_duckdb_DuckDBBindings_duckdb_1destroy_1table_1function
+Java_org_duckdb_DuckDBBindings_duckdb_1table_1function_1set_1name
+Java_org_duckdb_DuckDBBindings_duckdb_1table_1function_1add_1parameter
+Java_org_duckdb_DuckDBBindings_duckdb_1table_1function_1supports_1projection_1pushdown
+Java_org_duckdb_DuckDBBindings_duckdb_1register_1table_1function
+Java_org_duckdb_DuckDBBindings_duckdb_1register_1table_1function_1java
+Java_org_duckdb_DuckDBBindings_duckdb_1register_1table_1function_1java_1with_1function
+Java_org_duckdb_DuckDBBindings_duckdb_1bind_1get_1parameter_1count
+Java_org_duckdb_DuckDBBindings_duckdb_1bind_1get_1parameter
+Java_org_duckdb_DuckDBBindings_duckdb_1bind_1add_1result_1column
+Java_org_duckdb_DuckDBBindings_duckdb_1bind_1set_1bind_1data
+Java_org_duckdb_DuckDBBindings_duckdb_1bind_1set_1error
+Java_org_duckdb_DuckDBBindings_duckdb_1init_1set_1init_1data
+Java_org_duckdb_DuckDBBindings_duckdb_1init_1get_1column_1count
+Java_org_duckdb_DuckDBBindings_duckdb_1init_1get_1column_1index
+Java_org_duckdb_DuckDBBindings_duckdb_1init_1set_1max_1threads
+Java_org_duckdb_DuckDBBindings_duckdb_1init_1set_1error
+Java_org_duckdb_DuckDBBindings_duckdb_1function_1get_1bind_1data
+Java_org_duckdb_DuckDBBindings_duckdb_1function_1get_1init_1data
+Java_org_duckdb_DuckDBBindings_duckdb_1function_1get_1local_1init_1data
+Java_org_duckdb_DuckDBBindings_duckdb_1function_1set_1error
 
 duckdb_adbc_init
 duckdb_add_aggregate_function_to_set

duckdb_java.exp

Lines changed: 38 additions & 1 deletion
@@ -47,7 +47,6 @@ _Java_org_duckdb_DuckDBNative_duckdb_1jdbc_1set_1auto_1commit
 _Java_org_duckdb_DuckDBNative_duckdb_1jdbc_1set_1catalog
 _Java_org_duckdb_DuckDBNative_duckdb_1jdbc_1set_1schema
 _Java_org_duckdb_DuckDBNative_duckdb_1jdbc_1startup
-
 _Java_org_duckdb_DuckDBBindings_duckdb_1vector_1size
 _Java_org_duckdb_DuckDBBindings_duckdb_1create_1logical_1type
 _Java_org_duckdb_DuckDBBindings_duckdb_1get_1type_1id
@@ -79,6 +78,12 @@ _Java_org_duckdb_DuckDBBindings_duckdb_1list_1vector_1set_1size
 _Java_org_duckdb_DuckDBBindings_duckdb_1list_1vector_1reserve
 _Java_org_duckdb_DuckDBBindings_duckdb_1struct_1vector_1get_1child
 _Java_org_duckdb_DuckDBBindings_duckdb_1array_1vector_1get_1child
+_Java_org_duckdb_DuckDBBindings_duckdb_1udf_1get_1varchar_1bytes
+_Java_org_duckdb_DuckDBBindings_duckdb_1udf_1set_1varchar_1bytes
+_Java_org_duckdb_DuckDBBindings_duckdb_1udf_1get_1blob_1bytes
+_Java_org_duckdb_DuckDBBindings_duckdb_1udf_1set_1blob_1bytes
+_Java_org_duckdb_DuckDBBindings_duckdb_1udf_1get_1decimal
+_Java_org_duckdb_DuckDBBindings_duckdb_1udf_1set_1decimal
 _Java_org_duckdb_DuckDBBindings_duckdb_1create_1data_1chunk
 _Java_org_duckdb_DuckDBBindings_duckdb_1destroy_1data_1chunk
 _Java_org_duckdb_DuckDBBindings_duckdb_1data_1chunk_1reset
@@ -95,6 +100,38 @@ _Java_org_duckdb_DuckDBBindings_duckdb_1appender_1column_1count
 _Java_org_duckdb_DuckDBBindings_duckdb_1appender_1column_1type
 _Java_org_duckdb_DuckDBBindings_duckdb_1append_1data_1chunk
 _Java_org_duckdb_DuckDBBindings_duckdb_1append_1default_1to_1chunk
+_Java_org_duckdb_DuckDBBindings_duckdb_1create_1scalar_1function
+_Java_org_duckdb_DuckDBBindings_duckdb_1destroy_1scalar_1function
+_Java_org_duckdb_DuckDBBindings_duckdb_1scalar_1function_1set_1name
+_Java_org_duckdb_DuckDBBindings_duckdb_1scalar_1function_1add_1parameter
+_Java_org_duckdb_DuckDBBindings_duckdb_1scalar_1function_1set_1return_1type
+_Java_org_duckdb_DuckDBBindings_duckdb_1scalar_1function_1set_1volatile
+_Java_org_duckdb_DuckDBBindings_duckdb_1scalar_1function_1set_1special_1handling
+_Java_org_duckdb_DuckDBBindings_duckdb_1register_1scalar_1function
+_Java_org_duckdb_DuckDBBindings_duckdb_1register_1scalar_1function_1java
+_Java_org_duckdb_DuckDBBindings_duckdb_1register_1scalar_1function_1java_1with_1function
+_Java_org_duckdb_DuckDBBindings_duckdb_1create_1table_1function
+_Java_org_duckdb_DuckDBBindings_duckdb_1destroy_1table_1function
+_Java_org_duckdb_DuckDBBindings_duckdb_1table_1function_1set_1name
+_Java_org_duckdb_DuckDBBindings_duckdb_1table_1function_1add_1parameter
+_Java_org_duckdb_DuckDBBindings_duckdb_1table_1function_1supports_1projection_1pushdown
+_Java_org_duckdb_DuckDBBindings_duckdb_1register_1table_1function
+_Java_org_duckdb_DuckDBBindings_duckdb_1register_1table_1function_1java
+_Java_org_duckdb_DuckDBBindings_duckdb_1register_1table_1function_1java_1with_1function
+_Java_org_duckdb_DuckDBBindings_duckdb_1bind_1get_1parameter_1count
+_Java_org_duckdb_DuckDBBindings_duckdb_1bind_1get_1parameter
+_Java_org_duckdb_DuckDBBindings_duckdb_1bind_1add_1result_1column
+_Java_org_duckdb_DuckDBBindings_duckdb_1bind_1set_1bind_1data
+_Java_org_duckdb_DuckDBBindings_duckdb_1bind_1set_1error
+_Java_org_duckdb_DuckDBBindings_duckdb_1init_1set_1init_1data
+_Java_org_duckdb_DuckDBBindings_duckdb_1init_1get_1column_1count
+_Java_org_duckdb_DuckDBBindings_duckdb_1init_1get_1column_1index
+_Java_org_duckdb_DuckDBBindings_duckdb_1init_1set_1max_1threads
+_Java_org_duckdb_DuckDBBindings_duckdb_1init_1set_1error
+_Java_org_duckdb_DuckDBBindings_duckdb_1function_1get_1bind_1data
+_Java_org_duckdb_DuckDBBindings_duckdb_1function_1get_1init_1data
+_Java_org_duckdb_DuckDBBindings_duckdb_1function_1get_1local_1init_1data
+_Java_org_duckdb_DuckDBBindings_duckdb_1function_1set_1error
 
 _duckdb_adbc_init
 _duckdb_add_aggregate_function_to_set
