Commit ca0b914

Implement Phase 3: AttachType, XAttachType, FilepathType
Add remaining built-in AttributeTypes:
- <attach>: Internal file attachment stored in longblob
- <xattach>: External file attachment via <content> with deduplication
- <filepath@store>: Reference to existing file (no copy, returns ObjectRef)

Update implementation plan to mark Phase 3 complete.

Co-authored-by: dimitri-yatsenko <dimitri@datajoint.com>
1 parent e1b3be1 commit ca0b914
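
The `<filepath@store>` entry in the commit message (reference an existing file without copying it) can be illustrated with a small sketch. It assumes a store is simply a local directory. The names `encode_filepath` and `store_root` are hypothetical, only the `path`, `store`, and `size` keys are taken from the diff, and the `ObjectRef` returned by the real decoder is not reproduced here.

```python
from pathlib import Path
import tempfile


def encode_filepath(store_root, relative_path: str, store_name: str) -> dict:
    """Verify that a file already exists in the store and describe it; nothing is copied."""
    full_path = Path(store_root) / relative_path
    if not full_path.is_file():
        raise FileNotFoundError(f"{relative_path!r} not found in store {store_name!r}")
    # Only metadata is returned; the file itself stays where it is.
    return {"path": relative_path, "store": store_name, "size": full_path.stat().st_size}


def filepath_demo() -> dict:
    """Create a throwaway store directory with one file and encode a reference to it."""
    with tempfile.TemporaryDirectory() as root:
        (Path(root) / "raw").mkdir()
        (Path(root) / "raw" / "scan.dat").write_bytes(b"12345")
        return encode_filepath(root, "raw/scan.dat", "main")
```

Because only a reference is stored, a later change to the file in the store is not detected by this sketch; how the real type handles that is outside what the commit shows.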

2 files changed: +352 -18 lines changed

docs/src/design/tables/storage-types-implementation-plan.md

Lines changed: 35 additions & 18 deletions
@@ -19,7 +19,7 @@ This plan describes the implementation of a three-layer type architecture for Da
 | Phase 1: Core Type System | ✅ Complete | CORE_TYPES dict, type chain resolution |
 | Phase 2: Content-Addressed Storage | ✅ Complete | Function-based, no registry table |
 | Phase 2b: Path-Addressed Storage | ✅ Complete | ObjectType for files/folders |
-| Phase 3: User-Defined AttributeTypes | 🔲 Pending | AttachType/FilepathType pending |
+| Phase 3: User-Defined AttributeTypes | ✅ Complete | AttachType, XAttachType, FilepathType |
 | Phase 4: Insert and Fetch Integration | ✅ Complete | Type chain encoding/decoding |
 | Phase 5: Garbage Collection | 🔲 Pending | |
 | Phase 6: Documentation and Testing | 🔲 Pending | |
@@ -227,14 +227,16 @@ Both produce the same JSON metadata format compatible with `ObjectRef.from_json(
 
 ---
 
-## Phase 3: User-Defined AttributeTypes
+## Phase 3: User-Defined AttributeTypes
 
-**Status**: Partially complete
+**Status**: Complete
+
+All built-in AttributeTypes are implemented in `src/datajoint/builtin_types.py`.
 
 ### 3.1 XBlobType ✅
-Implemented as shown above. Composes with `<content>`.
+External serialized blobs using content-addressed storage. Composes with `<content>`.
 
-### 3.2 AttachType and XAttachType 🔲
+### 3.2 AttachType
 
 ```python
 @register_type
@@ -243,41 +245,53 @@ class AttachType(AttributeType):
     type_name = "attach"
     dtype = "longblob"
 
-    def encode(self, filepath, *, key=None) -> bytes:
-        path = Path(filepath)
-        return path.name.encode() + b"\0" + path.read_bytes()
+    def encode(self, filepath, *, key=None, store_name=None) -> bytes:
+        # Returns: filename (UTF-8) + null byte + contents
+        return path.name.encode("utf-8") + b"\x00" + path.read_bytes()
 
     def decode(self, stored, *, key=None) -> str:
-        filename, contents = stored.split(b"\0", 1)
-        # Write to download_path and return path
+        # Extracts to download_path, returns local path
         ...
+```
+
+### 3.3 XAttachType ✅
 
+```python
 @register_type
 class XAttachType(AttributeType):
     """External file attachment using content-addressed storage."""
     type_name = "xattach"
-    dtype = "<content>"
-    # Similar to AttachType but composes with content storage
+    dtype = "<content>" # Composes with ContentType
+    # Same encode/decode as AttachType, but stored externally with dedup
 ```
 
-### 3.3 FilepathType 🔲
+### 3.4 FilepathType
 
 ```python
 @register_type
 class FilepathType(AttributeType):
-    """Portable relative path reference within configured stores."""
+    """Reference to existing file in configured store."""
     type_name = "filepath"
     dtype = "json"
 
     def encode(self, relative_path: str, *, key=None, store_name=None) -> dict:
-        """Register reference to file in store."""
-        return {'path': relative_path, 'store': store_name}
+        # Verifies file exists, returns metadata
+        return {'path': path, 'store': store_name, 'size': size, ...}
 
     def decode(self, stored: dict, *, key=None) -> ObjectRef:
-        """Return ObjectRef for lazy access."""
-        return ObjectRef(store=stored['store'], path=stored['path'])
+        # Returns ObjectRef for lazy access
+        return ObjectRef.from_json(stored, backend=backend)
 ```
 
+### Type Comparison
+
+| Type | Storage | Copies File | Dedup | Returns |
+|------|---------|-------------|-------|---------|
+| `<attach>` | Database | Yes | No | Local path |
+| `<xattach>` | External | Yes | Yes | Local path |
+| `<filepath>` | Reference | No | N/A | ObjectRef |
+| `<object>` | External | Yes | No | ObjectRef |
+
 ---
 
 ## Phase 4: Insert and Fetch Integration ✅
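
The `<attach>` layout described in the hunk above (UTF-8 filename, a null byte, then the file contents) can be sketched as a standalone round trip. This is a minimal sketch, not the code in `src/datajoint/builtin_types.py`: the function names `encode_attachment` and `decode_attachment` and the `download_dir` parameter are hypothetical, and the real `AttachType` may resolve the download location or name collisions differently. Unlike the abbreviated `encode` in the plan, the sketch defines `path` before using it.

```python
from pathlib import Path
import tempfile


def encode_attachment(filepath) -> bytes:
    """Pack a file as: filename (UTF-8) + null byte + raw contents."""
    path = Path(filepath)
    return path.name.encode("utf-8") + b"\x00" + path.read_bytes()


def decode_attachment(stored: bytes, download_dir) -> str:
    """Split at the first null byte, write the contents, return the local path."""
    # partition() splits on the first separator only, so null bytes
    # inside the file contents are preserved.
    filename, sep, contents = stored.partition(b"\x00")
    if not sep:
        raise ValueError("stored attachment has no null separator")
    # Keep only the final path component so a stored name cannot escape download_dir.
    target = Path(download_dir) / Path(filename.decode("utf-8")).name
    target.write_bytes(contents)
    return str(target)


def roundtrip_demo() -> tuple:
    """Encode a temporary file, decode it into another directory, return (name, contents)."""
    with tempfile.TemporaryDirectory() as src, tempfile.TemporaryDirectory() as dst:
        original = Path(src) / "data.bin"
        original.write_bytes(b"abc\x00def")  # contents contain a null byte on purpose
        restored = Path(decode_attachment(encode_attachment(original), dst))
        return restored.name, restored.read_bytes()
```

Splitting on the first null byte is safe because a filename cannot contain one on common filesystems, while the contents may.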
@@ -433,9 +447,12 @@ Layer 1: Native Database Types
 **Built-in AttributeTypes:**
 ```
 <djblob> → longblob (internal serialized storage)
+<attach> → longblob (internal file attachment)
 <object> → json (path-addressed, for Zarr/HDF5/folders)
+<filepath> → json (reference to existing file in store)
 <content> → json (content-addressed with deduplication)
 <xblob> → <content> → json (external serialized with dedup)
+<xattach> → <content> → json (external file attachment with dedup)
 ```
 
 **Type Composition Example:**
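
As an illustration of the `<xattach> → <content> → json` chain listed in this hunk, the following toy model shows how content addressing yields deduplication. It is a sketch under stated assumptions: `InMemoryContentStore` and `encode_xattach` are hypothetical names, SHA-256 is an assumed hash (the diff does not name one), and the metadata keys `hash` and `size` are illustrative rather than the project's actual JSON format.

```python
import hashlib


class InMemoryContentStore:
    """Toy content-addressed store: objects are keyed by the hash of their bytes."""

    def __init__(self):
        self.objects = {}

    def put(self, data: bytes) -> dict:
        digest = hashlib.sha256(data).hexdigest()  # assumed hash function
        # Identical bytes map to the same key, so a repeated insert stores nothing new.
        self.objects.setdefault(digest, data)
        return {"hash": digest, "size": len(data)}

    def get(self, metadata: dict) -> bytes:
        return self.objects[metadata["hash"]]


def encode_xattach(store: InMemoryContentStore, filename: str, contents: bytes) -> dict:
    """Outer layer: pack like <attach>, then hand the bytes to the content layer."""
    payload = filename.encode("utf-8") + b"\x00" + contents
    return store.put(payload)
```

In this sketch the filename is part of the hashed payload, so the same contents attached under two different names are stored twice; whether the real implementation behaves this way is not stated in the diff.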
