Commit 4fa802a

Document AttributeType naming conventions comprehensively

- Reorganize "Special DataJoint-only datatypes" as "AttributeTypes"
- Add naming convention explanation (dj prefix, x prefix, @store suffix)
- List all built-in AttributeTypes with categories:
  - Serialization types: <djblob>, <xblob>
  - File storage types: <object>, <content>
  - File attachment types: <attach>, <xattach>
  - File reference types: <filepath>
- Fix inconsistent angle bracket notation throughout docs
- Update example to use int32 core type and include <djblob>
- Expand naming conventions in Key Design Decisions section

Co-authored-by: dimitri-yatsenko <dimitri@datajoint.com>

1 parent e55c9a7 commit 4fa802a

2 files changed: +50 -21 lines changed

docs/src/design/tables/attributes.md

Lines changed: 44 additions & 15 deletions
@@ -76,27 +76,55 @@ for portable pipelines. Using native types will generate a warning.
 
 See the [storage types spec](storage-types-spec.md) for complete mappings.
 
-## Special DataJoint-only datatypes
+## AttributeTypes (special datatypes)
 
-These types abstract certain kinds of non-database data to facilitate use
-together with DataJoint.
+AttributeTypes provide `encode()`/`decode()` semantics for complex data that doesn't
+fit native database types. They are denoted with angle brackets: `<type_name>`.
+
+### Naming conventions
+
+- **`dj` prefix**: DataJoint-specific internal serialization (`<djblob>`)
+- **`x` prefix**: External/content-addressed variant (`<xblob>`, `<xattach>`)
+- **`@store` suffix**: Specifies which configured store to use
+
+### Built-in AttributeTypes
+
+**Serialization types** - for Python objects:
 
 - `<djblob>`: DataJoint's native serialization format for Python objects. Supports
-NumPy arrays, dicts, lists, datetime objects, and nested structures. Compatible with
-MATLAB. See [custom types](customtype.md) for details.
+NumPy arrays, dicts, lists, datetime objects, and nested structures. Stores in
+database. Compatible with MATLAB. See [custom types](customtype.md) for details.
+
+- `<xblob>` / `<xblob@store>`: Like `<djblob>` but stores externally with content-
+addressed deduplication. Use for large arrays that may be duplicated across rows.
+
+**File storage types** - for managed files:
+
+- `<object>` / `<object@store>`: Managed file and folder storage with path derived
+from primary key. Supports Zarr, HDF5, and direct writes via fsspec. Returns
+`ObjectRef` for lazy access. See [object storage](object.md).
+
+- `<content>` / `<content@store>`: Content-addressed storage for raw bytes with
+SHA256 deduplication. Use via `<xblob>` or `<xattach>` rather than directly.
+
+**File attachment types** - for file transfer:
+
+- `<attach>`: File attachment stored in database with filename preserved. Similar
+to email attachments. Good for small files (<16MB). See [attachments](attach.md).
+
+- `<xattach>` / `<xattach@store>`: Like `<attach>` but stores externally with
+deduplication. Use for large files.
 
-- `object`: managed [file and folder storage](object.md) with support for direct writes
-(Zarr, HDF5) and fsspec integration. Recommended for new pipelines.
+**File reference types** - for external files:
 
-- `attach`: a [file attachment](attach.md) similar to email attachments facillitating
-sending/receiving an opaque data file to/from a DataJoint pipeline.
+- `<filepath@store>`: Reference to existing file in a configured store. No file
+copying occurs. Returns `ObjectRef` for lazy access. See [filepath](filepath.md).
 
-- `filepath@store`: a [filepath](filepath.md) used to link non-DataJoint managed files
-into a DataJoint pipeline.
+### User-defined AttributeTypes
 
-- `<custom_type>`: a [custom attribute type](customtype.md) that defines bidirectional
-conversion between Python objects and database storage formats. Use this to store
-complex data types like graphs, domain-specific objects, or custom data structures.
+- `<custom_type>`: Define your own [custom attribute type](customtype.md) with
+bidirectional conversion between Python objects and database storage. Use for
+graphs, domain-specific objects, or custom data structures.
 
 ## Core type aliases
 
@@ -125,14 +153,15 @@ Example usage:
 @schema
 class Measurement(dj.Manual):
     definition = """
-    measurement_id : int
+    measurement_id : int32
     ---
     temperature : float32 # single-precision temperature reading
     precise_value : float64 # double-precision measurement
     sample_count : uint32 # unsigned 32-bit counter
     sensor_flags : uint8 # 8-bit status flags
     is_valid : bool # boolean flag
     raw_data : bytes # raw binary data
+    processed : <djblob> # serialized Python object
     """
 ```
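
Reviewer sketch (not part of the commit): the angle-bracket notation and naming conventions documented in this file can be illustrated with a small parser. This is a minimal sketch, not DataJoint's own parsing code. `parse_type_spec`, `TypeSpec`, and both lookup tables are hypothetical names; the categories are copied from the lists in the diff, and the assumption that type and store names consist only of word characters is mine.

```python
import re
from typing import NamedTuple, Optional

# Built-in AttributeType names and their categories, as listed in the diff.
BUILTIN_CATEGORIES = {
    "djblob": "serialization",
    "xblob": "serialization",
    "object": "file storage",
    "content": "file storage",
    "attach": "file attachment",
    "xattach": "file attachment",
    "filepath": "file reference",
}

# Names that follow the "x prefix" convention (external/content-addressed).
# A lookup table is used instead of a prefix test so that a user-defined
# type such as <xyz> is not misread as an external variant.
X_VARIANTS = {"xblob", "xattach"}

# Assumed grammar: <name> or <name@store>, word characters only.
_SPEC = re.compile(r"^<(?P<name>[A-Za-z_]\w*)(?:@(?P<store>\w+))?>$")


class TypeSpec(NamedTuple):
    name: str
    store: Optional[str]  # None when no @store suffix is given
    category: str         # "user-defined" for names not in the built-in table
    x_variant: bool       # True for external/content-addressed variants


def parse_type_spec(spec: str) -> TypeSpec:
    """Split an angle-bracket type spec into name, store, and category."""
    match = _SPEC.match(spec.strip())
    if match is None:
        raise ValueError(f"not an angle-bracket type spec: {spec!r}")
    name = match["name"]
    return TypeSpec(
        name=name,
        store=match["store"],
        category=BUILTIN_CATEGORIES.get(name, "user-defined"),
        x_variant=name in X_VARIANTS,
    )
```

For example, `parse_type_spec("<xblob@store>")` yields the name `xblob`, the store `store`, the category `serialization`, and `x_variant=True`, while a bare core type such as `int32` raises `ValueError` because it carries no angle brackets.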

docs/src/design/tables/storage-types-spec.md

Lines changed: 6 additions & 6 deletions
@@ -652,11 +652,11 @@ def garbage_collect(project):
 8. **No `uri` type**: For arbitrary URLs, use `varchar`—simpler and more transparent
 9. **Content type**: Single-blob, content-addressed, deduplicated storage
 10. **Parameterized types**: `<type@param>` passes store parameter
-11. **Naming convention**:
-    - `<djblob>` = internal serialized (database)
-    - `<xblob>` = external serialized (content-addressed)
-    - `<attach>` = internal file (single file)
-    - `<xattach>` = external file (single file)
+11. **Naming conventions**:
+    - `dj` prefix = DataJoint-specific internal serialization (`<djblob>`)
+    - `x` prefix = external/content-addressed variant (`<xblob>`, `<xattach>`)
+    - `@store` suffix = specifies which configured store to use
+    - Types without prefix: core storage mechanisms (`<object>`, `<content>`, `<attach>`, `<filepath>`)
 12. **Transparent access**: AttributeTypes return Python objects or file paths
 13. **Lazy access**: `<object>`, `<object@store>`, and `<filepath@store>` return ObjectRef
 
@@ -668,7 +668,7 @@ def garbage_collect(project):
 | `blob@store` | `<xblob@store>` |
 | `attach` | `<attach>` |
 | `attach@store` | `<xattach@store>` |
-| `filepath@store` (copy-based) | `filepath@store` (ObjectRef-based, upgraded) |
+| `<filepath@store>` placeholder
 
 ### Migration from Legacy `~external_*` Stores
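
Reviewer sketch (not part of the commit): the legacy-to-new rows visible in the second hunk can be expressed as a small lookup. This is an illustration under stated assumptions, not a DataJoint migration tool. `upgrade_legacy_type` and `_LEGACY_TO_NEW` are hypothetical names, and only the four rows shown in the hunk are covered; any other legacy type raises an error rather than being guessed.

```python
# Keys are (legacy base name, has @store suffix); values are the new
# AttributeType names. Only the rows visible in the hunk are included.
_LEGACY_TO_NEW = {
    ("blob", True): "xblob",
    ("attach", False): "attach",
    ("attach", True): "xattach",
    ("filepath", True): "filepath",
}


def upgrade_legacy_type(legacy: str) -> str:
    """Translate a legacy type string into angle-bracket notation."""
    base, sep, store = legacy.strip().partition("@")
    has_store = bool(sep)
    if has_store and not store:
        raise ValueError(f"empty store name in {legacy!r}")
    try:
        new_name = _LEGACY_TO_NEW[(base, has_store)]
    except KeyError:
        raise ValueError(f"no mapping shown for legacy type {legacy!r}") from None
    # Keep the store suffix unchanged; only the type name is rewritten.
    return f"<{new_name}@{store}>" if has_store else f"<{new_name}>"
```

The lookup keys on both the base name and the presence of a store because `attach` and `attach@store` map to different AttributeTypes (`<attach>` versus `<xattach@store>`).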
