
Releases: NeuroJSON/bjdata

BJData Specification - Draft-4

09 Apr 04:39

BJData Specification — Draft 4 Release Notes

Release date: April 2026
Previous release: Draft 3 (March 2025)
Specification: Binary_JData_Specification.md


Overview

Draft 4 is the fourth stable release of the Binary JData (BJData) specification.
It introduces two major new features — Structure-of-Arrays (SoA) containers and
a typed Extension (E) mechanism — along with a machine-readable JSON Schema,
a new Dart parser library, and command-line viewer utilities.

Draft 4 is backward compatible with Draft 3. All valid Draft-3 files are valid
Draft-4 files. Parsers that do not implement the new features can safely skip unknown
markers or treat them as opaque binary data.


New Features

1. Structure-of-Arrays (SoA) container

SoA is a new class of optimized container that stores arrays of structured records
with a shared inline schema. It enables highly compact and efficient serialization
of tabular data (e.g., Pandas DataFrames, NumPy arrays-of-structs, simulation particle
data) without any per-record type redundancy.

Container syntax

[[$] [{schema}] [#] <count>  <payload>    // row-major  (array of records)
[{$] [{schema}] [#] <count>  <payload>    // column-major (object of arrays)
  • Row-major ([$) — records are stored sequentially; each record's fields appear
    together (interleaved layout, cache-friendly for per-record access).
  • Column-major ({$) — all values of each field are stored together (columnar
    layout, cache-friendly for per-field access and vectorized operations).

Both forms support N-dimensional counts using the same #[...] syntax as ND arrays.
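
To make the difference between the two layouts concrete, the following is a minimal Python sketch of the payload bytes only. It does not emit the container header or the schema, and the record set and field choice (a uint8 `id` and a float32 `x`) are hypothetical values picked for illustration.

```python
import struct

# Hypothetical records with two fields: id (uint8, marker U) and x (float32, marker d).
records = [(1, 0.5), (2, 1.5), (3, 2.5)]

def pack_row_major(recs):
    # Interleaved layout: id0 x0 id1 x1 id2 x2 (little-endian, no padding).
    return b"".join(struct.pack("<Bf", rid, x) for rid, x in recs)

def pack_column_major(recs):
    # Columnar layout: id0 id1 id2 followed by x0 x1 x2.
    n = len(recs)
    ids = struct.pack("<%dB" % n, *(r[0] for r in recs))
    xs = struct.pack("<%df" % n, *(r[1] for r in recs))
    return ids + xs

row_payload = pack_row_major(records)
col_payload = pack_column_major(records)
```

Both payloads have the same length (5 bytes per record here); only the ordering of the bytes differs, which is what makes one layout better for per-record access and the other for per-field access.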

Schema definition

The schema is a payload-less object — field names followed by type markers only,
with no data values. Supported field types include:

Category                    | Markers
Fixed-length numeric        | U i u I l m L M h d D C B
Boolean                     | T (stored as 1 byte T/F per record)
Null / placeholder          | Z (zero bytes in payload; reserved field)
Fixed-length string         | S <int-type> <length>
Dictionary string           | [$S#<n><str1>... (index per record)
Offset-table string         | [$<offset-type>] (appended offset table + buffer)
Fixed-length high-precision | H <int-type> <length>
Nested object               | {...} (all fields must be supported types)
Fixed array                 | [type type ...] (explicit per-element types)
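
Because every supported field type has a fixed width, the number of payload bytes per record can be computed from the schema alone. The sketch below illustrates that idea. The in-memory schema representation (a dict mapping field names to a marker string, an `("S", length)` or `("H", length)` tuple, a nested dict, or a list of markers) is a hypothetical convenience and not the binary schema encoding; dictionary and offset-table strings are left out.

```python
# Byte widths of the fixed-length markers listed above.
FIXED_SIZES = {
    "U": 1, "i": 1, "u": 2, "I": 2, "l": 4, "m": 4, "L": 8, "M": 8,
    "h": 2, "d": 4, "D": 8, "C": 1, "B": 1,
    "T": 1,  # boolean: one byte (T or F) per record
    "Z": 0,  # null placeholder: no payload bytes
}

def record_size(schema):
    """Return the payload bytes per record for a hypothetical dict schema."""
    total = 0
    for spec in schema.values():
        if isinstance(spec, dict):        # nested object
            total += record_size(spec)
        elif isinstance(spec, list):      # fixed array of markers
            total += sum(FIXED_SIZES[m] for m in spec)
        elif isinstance(spec, tuple):     # ("S", length) or ("H", length)
            total += spec[1]
        else:                             # single fixed-length marker
            total += FIXED_SIZES[spec]
    return total

# Hypothetical particle record: uint32 id, 3 x float32 position,
# 8-byte name, and a nested flags object.
particle = {
    "id": "m",
    "pos": ["d", "d", "d"],
    "name": ("S", 8),
    "flags": {"alive": "T", "reserved": "Z"},
}
bytes_per_record = record_size(particle)  # 4 + 12 + 8 + 1 + 0
```

A reader that knows the record size and the count can then locate any record in a row-major payload by simple multiplication, without scanning.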

Variable-length string storage

Three modes are provided to cover all cardinality patterns:

  • Fixed-length (S <int> <len>) — best for short codes and identifiers of known
    maximum length; no per-record length prefix.
  • Dictionary-based ([$S#<n>...) — best for categorical/low-cardinality strings;
    each record stores only a small integer index into an embedded string table.
  • Offset-table-based ([$<type>]) — best for diverse free-text strings of highly
    variable length; strings are concatenated into a buffer appended after the payload,
    with an offset table for O(1) random access.
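
The three modes can be sketched in Python as follows. This shows only the logical data each mode produces, not the exact bytes on disk. The NUL padding byte, the UTF-8 encoding, and the use of n+1 offsets (so that each string is the slice between two consecutive offsets) are assumptions made for this sketch; consult the specification for the actual conventions.

```python
def encode_fixed(strings, width, pad=b"\x00"):
    # Fixed-length mode: every record occupies exactly `width` bytes.
    out = []
    for s in strings:
        raw = s.encode("utf-8")
        if len(raw) > width:
            raise ValueError("string longer than fixed width")
        out.append(raw.ljust(width, pad))
    return b"".join(out)

def encode_dictionary(strings):
    # Dictionary mode: unique strings go into a table; records store indices.
    table, index_of, indices = [], {}, []
    for s in strings:
        if s not in index_of:
            index_of[s] = len(table)
            table.append(s)
        indices.append(index_of[s])
    return table, indices

def encode_offsets(strings):
    # Offset-table mode: strings are concatenated into one buffer and an
    # offset table records where each one starts and ends.
    offsets, buf = [0], bytearray()
    for s in strings:
        buf += s.encode("utf-8")
        offsets.append(len(buf))
    return offsets, bytes(buf)

def lookup(offsets, buf, i):
    # O(1) random access: slice the buffer between two adjacent offsets.
    return buf[offsets[i]:offsets[i + 1]].decode("utf-8")
```

For example, `encode_dictionary(["x", "y", "x"])` yields a two-entry table and the index list `[0, 1, 0]`, which is why this mode suits low-cardinality columns.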

2. Extension type (E)

The new E marker provides a general-purpose mechanism for embedding typed binary
data that is not natively representable in the core BJData type system.

Format

[E][id-type][id-value][len-type][len-value][payload]
  • id-value in range 0–255: reserved types defined by this specification (all
    use fixed payload sizes for unambiguous parsing).
  • id-value 256 and above: application-defined types; meaning is determined by
    the application and should be documented separately.
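
The format above can be illustrated with a minimal Python sketch of an encoder and decoder for a single E record. The helper names are hypothetical, and the choice of the smallest unsigned integer marker (U, u, or m) for the id and length fields is an assumption of this sketch rather than a requirement stated in these notes.

```python
import struct

# (marker, struct format, maximum value) for three unsigned integer markers.
_UINT = [(b"U", "<B", 0xFF), (b"u", "<H", 0xFFFF), (b"m", "<I", 0xFFFFFFFF)]
_FMT = {marker: fmt for marker, fmt, _ in _UINT}

def _pack_uint(value):
    # Emit [type-marker][little-endian value] using the smallest fitting type.
    for marker, fmt, vmax in _UINT:
        if 0 <= value <= vmax:
            return marker + struct.pack(fmt, value)
    raise ValueError("value out of range for this sketch")

def _unpack_uint(buf, pos):
    fmt = _FMT[buf[pos:pos + 1]]
    (value,) = struct.unpack_from(fmt, buf, pos + 1)
    return value, pos + 1 + struct.calcsize(fmt)

def encode_extension(type_id, payload):
    # [E][id-type][id-value][len-type][len-value][payload]
    return b"E" + _pack_uint(type_id) + _pack_uint(len(payload)) + payload

def decode_extension(buf, pos=0):
    if buf[pos:pos + 1] != b"E":
        raise ValueError("not an extension record")
    type_id, pos = _unpack_uint(buf, pos + 1)
    length, pos = _unpack_uint(buf, pos)
    payload = buf[pos:pos + length]
    if len(payload) != length:
        raise ValueError("truncated extension payload")
    return type_id, payload, pos + length
```

Because the payload length is always present, a decoder that does not recognize a given id can still step over the record and continue with the data that follows it.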

Reserved extension types (0–255)

ID     | Name         | Payload  | Description
0      | -            | -        | Reserved for future use
1      | epoch_s      | 4 bytes  | Unix timestamp in seconds (uint32)
2      | epoch_us     | 8 bytes  | Unix timestamp in microseconds (int64)
3      | epoch_ns     | 12 bytes | Unix timestamp in nanoseconds (int64 + uint32)
4      | date         | 4 bytes  | Calendar date: year (int16) + month + day (uint8 each)
5      | time_s       | 4 bytes  | Time of day in seconds: hour, minute, second, reserved
6      | datetime_us  | 8 bytes  | Date and time in microseconds since epoch (int64)
7      | timedelta_us | 8 bytes  | Duration in microseconds (int64)
8      | complex64    | 8 bytes  | Complex number: real + imaginary (float32 each)
9      | complex128   | 16 bytes | Complex number: real + imaginary (float64 each)
10     | uuid         | 16 bytes | UUID per RFC 4122 (Big-Endian, exception to BJData convention)
11–255 | -            | -        | Reserved for future specification

All numeric fields in extension payloads use Little-Endian byte order, except UUID
which follows RFC 4122 network byte order.
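
As an illustration of the byte-order rule, the sketch below packs the payloads of a few reserved types with Python's `struct` and `uuid` modules. The function names are hypothetical, and the field order within `date` (year, then month, then day) is assumed from the order in which the fields are listed above.

```python
import struct
import uuid

def pack_epoch_s(seconds):
    return struct.pack("<I", seconds)           # uint32, Little-Endian, 4 bytes

def pack_epoch_us(microseconds):
    return struct.pack("<q", microseconds)      # int64, Little-Endian, 8 bytes

def pack_date(year, month, day):
    return struct.pack("<hBB", year, month, day)  # int16 + 2 x uint8, 4 bytes

def pack_complex64(z):
    return struct.pack("<ff", z.real, z.imag)   # 2 x float32, 8 bytes

def pack_complex128(z):
    return struct.pack("<dd", z.real, z.imag)   # 2 x float64, 16 bytes

def pack_uuid(u):
    # uuid.UUID.bytes is already in RFC 4122 big-endian (network) order,
    # so it is stored as-is rather than byte-swapped.
    return u.bytes
```

The payload sizes produced here match the Payload column above, which is what allows a parser to validate a reserved type by length alone.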


Additional Changes

JSON Schema

A formal JSON Schema (schema/bjdata_format_schema.json) is now included in the
repository. It describes the logical structure of BJData documents, covering all
value types, container variants (including optimized 1D and ND arrays, SoA row- and
column-major), and the new Extension type.

Dart parser library

A Dart implementation of the BJData parser
(dart-bjdata) has been added to the list
of supported libraries.

Command-line viewer utilities

Two minimal command-line viewers, njv (Perl) and njv.py (Python), are included
for quickly inspecting the content of .bjd files without a full parsing library.


Specification Corrections

The following errors in the Draft-3 specification text were corrected in Draft 4:

  • Numeric type table: the "Signed" column was inverted for all types from uint8
    onwards (e.g., int16 was shown as unsigned, uint16 as signed).
  • d marker in SoA examples: all SoA section examples used d (float32) where
    D (float64) was intended, including mismatched byte offsets in the detailed
    encoding table.
  • Dimension variable typo: Nz/Ny/Nz/Ndim referenced in the ND array section
    was corrected to Nx/Ny/Nz/Ndim.
  • Invalid JSON in examples: trailing and missing commas in the char and byte
    value examples were corrected.
  • Float description: "fraction (significant)" corrected to "fraction (significand)"
    in the float16/32/64 structure descriptions.

Upgrading from Draft 3

No changes to the Draft-3 binary encoding are required. Draft 4 adds two new
constructs, SoA containers and the E marker, that never appear in Draft-3
files:

  • A Draft-3 parser will encounter E markers or SoA containers only in files
    written by a Draft-4 encoder that actually uses them. Parsers that treat
    unknown markers as opaque data or raise a controlled error will continue
    to operate correctly on all other content in the file.
  • A Draft-4 encoder writing for Draft-3 consumers should avoid using the new
    E marker and SoA constructs until consumers are updated.

Acknowledgement

This specification was developed as part of the NeuroJSON project
(https://neurojson.org), with funding support from the US National Institutes of
Health (NIH) under grant U24-NS124027.


ChangeLog

2026-04-08 [0244bf4] [feat] add njv and njv.py to rapidly display bjd files in Perl/Python
2026-04-08 [04703ec] [release] update language and schema for Draft-4
2026-01-10 [7e9ccd4] Merge pull request #23 from nebkat/lib/dart
2026-01-10 [2c85e13] Merge branch 'master' into lib/dart
2026-01-10 [c4ca869] [doc] highlight uniformity of data in ND packed array and SoA payload
2026-01-10 [edf9d2d] [ci] fix additional spellchecker errors
2026-01-10 [874e662] [ci] fix spellcheck errors
2026-01-10 [b68625a] Update .wordlist.txt
2026-01-10 [2492c0a] [feat] add support for extension data type E marker
2026-01-10 [7486ab8] [doc] update introduction, before adding Extension type
2026-01-05 [c2ac37e] [feat] extension to support variable length strings in SoA
2026-01-04 [3da7097] Update .wordlist.txt
2026-01-04 [044497b] [typo] fix typos, fix TOC
2026-01-03 [71c3668] [Feat] initial draft of structure-of-arrays using object schema
2025-12-02 [b82ef0d] Bump actions/checkout from 5 to 6
2025-10-05 [55ab6da] [lib] Add Dart parser
2025-08-24 [797b314] [schema] add json schema for bjdata binary json format
2025-03-23 [54a8db7] [lib] update all parser submodules
2025-03-23 [f937ad4] [doc] update README links

BJData Specification - Draft 3

24 Mar 04:05

Binary JData: A portable interchange format for complex binary data

ChangeLog

2025-02-22 [14b23ac] Update README.md
2025-02-22 [1b718a7] Set h background to match float type
2025-02-22 [1c7f954] Update BJData synopsis
2025-02-22 [51830fd] Provide column-major optimized ND array example
2025-02-22 [085d0a4] Update .wordlist.txt
2025-02-22*[a9683d9] Optimized array supports column-major storage
2025-02-22 [7adc621] Update Binary_JData_Specification.md
2024-11-24 [0844096] fix: Byte value example
2024-11-24 [94c5ac7] add "parsers" to wordlist
2024-10-25 [548cc29] Added basic Dependabot configuration
2024-05-10*[f9cd4e3] feat: Binary value type
2024-01-22 [883d1d6] add new word to wordlist
2024-01-22 [e3b9e7b] change urls to https
2024-01-22 [1f4f7fd] add missing config file
2024-01-22 [244422a] add missing config file
2024-01-22 [a8f45ba] add spell checking action
2022-06-09 [d791c73] add 1-page synosis, update all parsers to the latest git versions

BJData Specification - Draft 2

24 Mar 03:39

Binary JData: A portable interchange format for complex binary data

ChangeLog

2022-03-02 [98e0b52] update RFC links in README for Draft 2
2022-03-02*[90150eb] Disable [{SHTFNZ as optimized type marker due to security risks and diminished benefit
2022-02-22 [996ecb8] Update README.md
2022-02-21 [fc51791] Update Binary_JData_Specification.md
2022-02-21 [ecaf84c] update stable release link to Draft 2
2022-02-21 [83bc0a3] remote outdated module path
2022-02-21 [9231620] add c++ library
2022-02-21 [2429e30] update draft-2 compatible libraries
2022-02-21 [20390d1] Update README.md
2022-02-21 [fb7f438] Polish specification for Draft 2 freeze
2022-02-21 [c78697d] Update README.md
2022-02-21 [f9f872b] Update README.md
2022-02-19 [9edaa9f] fix typo
2022-01-30*[19f4b5c] Breaking: BJData Draft-2 now uses Little-Endian as default int/float byte order
2020-08-10 [d322683] update python modules
2020-07-13 [cb3c5ff] update fully functional python and matlab modules
2020-06-07 [43cb6ea] Update version number
2020-05-16 [8352986] update jsonlab and zmat to the latest

BJData Specification - Draft-1

24 Mar 03:37

Binary JData: A portable interchange format for complex binary data

  • Status of this document: Request for comments.
  • Maintainer: Qianqian Fang <q.fang at neu.edu>
  • License: Apache License, Version 2.0
  • Version: 1 (Draft-1)
  • URL: https://neurojson.org/bjdata/draft1

ChangeLog

2020-05-12 [2da127d] update RFC commit link for Draft 1 in README
2020-05-12 [8a6b1a0] update RFC commit link for Draft 1 in README
2020-05-12 [8710a15] change document status to request for comments
2020-05-12 [5d3d88f] Minor language updates
2020-05-12 [8722606] Language proofreading, thanks to Edward Xu (@edwardx324)
2020-05-11 [99fbdb9] Minor language polishing
2020-05-11 [a067ed3] Add introduction
2020-05-08 [a418aaf] add Py-JData module
2020-05-08 [ec2fcf7] add initial python support to BJData draft 1
2020-05-08 [3ae7a1f] sync jsonlab to the latest version
2020-05-06 [a73d201] sync with jsonlab fix
2020-05-06 [214a590] update to the latest jsonlab which supports bjd draft 1
2020-05-04 [ba2726b] Add project logo in README
2020-05-04 [4e25de0] Minor formatting update
2020-05-04 [0cac4cb] Polishing format and language for clarity
2020-05-04 [9f592e8] add BJData compliant parsers
2020-05-04 [92de0c9] H marker should be signed
2020-05-04 [f3d338f] Use IEEE 754 NaN and Infinity binary forms instead of converting to null
2020-05-04 [e98f607] fix formatting of 3D array example
2020-05-04 [95544dc] Add uint16 (u), uint32 (m), uint64 (M), half (h) markers and ND array optimized syntax
2020-05-04 [16c00e2] Update TOC links
2020-05-04 [fdf43b1] Add title, abstract, and acknowledgement
2020-05-03 [7b559a5] renaming document to binary jdata spec
2020-05-03 [582e078] Add README for initial import
2020-05-03 [119d0bd] Initial import of Markup version of UBJSON Spec from https://github.com/Iotic-Labs/py-ubjson