Merged
64 commits
2be5f11
Introduce AttributeType system to replace AttributeAdapter
claude Dec 21, 2025
055c9c6
Update documentation for new AttributeType system
claude Dec 21, 2025
af9bd8d
Apply ruff-format fixes to AttributeType implementation
claude Dec 21, 2025
9bd37f6
Add DJBlobType and migration utilities for blob columns
claude Dec 21, 2025
c8d8a22
Clarify migration handles all blob type variants
claude Dec 21, 2025
61db015
Fix ruff linter errors: add migrate to __all__, remove unused import
claude Dec 21, 2025
78e0d1d
Remove serializes flag; longblob is now raw bytes
claude Dec 21, 2025
c173356
Remove unused blob imports from fetch.py and table.py
claude Dec 21, 2025
106f859
Update docs: use <djblob> for serialized data, longblob for raw bytes
claude Dec 21, 2025
e293fec
Merge branch 'claude/add-file-column-type-LtXQt' into claude/upgrade-…
dimitri-yatsenko Dec 21, 2025
d66f76e
Merge claude/add-file-column-type-LtXQt into upgrade-adapted-type
claude Dec 22, 2025
2f9b2be
Merge remote-tracking branch 'origin/claude/upgrade-adapted-type-1W3a…
claude Dec 22, 2025
1d4b7b4
Merge remote-tracking branch 'origin/pre/v2.0' into claude/upgrade-ad…
claude Dec 22, 2025
8091225
Merge remote-tracking branch 'origin/claude/add-file-column-type-LtXQ…
claude Dec 23, 2025
c34a5b8
Merge remote-tracking branch 'origin/claude/add-file-column-type-LtXQ…
claude Dec 24, 2025
cab10f6
Add storage types redesign spec
claude Dec 25, 2025
261543f
Merge remote-tracking branch 'origin/pre/v2.0' into claude/upgrade-ad…
claude Dec 25, 2025
ecac82d
Update storage types spec with OAS integration approach
claude Dec 25, 2025
7e7f968
Unify external storage under OAS with content-addressed region
claude Dec 25, 2025
495d7f7
Make <djblob@store> and <attach@store> return values transparently
claude Dec 25, 2025
7ae8f15
Introduce layered storage architecture with content core type
claude Dec 25, 2025
6fcc4d3
Add parameterized AttributeTypes and content vs object comparison
claude Dec 25, 2025
b87342b
Make content storage per-project and add migration utility
claude Dec 25, 2025
40c1dbb
Add filepath as third OAS region with ObjectRef interface
claude Dec 25, 2025
dbf092d
Redesign filepath as URI reference tracker and add json core type
claude Dec 25, 2025
43c1999
Simplify filepath to filepath@store with relative paths for portability
claude Dec 25, 2025
b9b6e34
Simplify to two-layer architecture: database types + AttributeTypes
claude Dec 25, 2025
2a5d161
Add three-layer type architecture with core DataJoint types
claude Dec 25, 2025
d36739d
Use angle brackets for all AttributeTypes in definitions
claude Dec 25, 2025
5c1e854
Add implementation plan for storage types redesign
claude Dec 25, 2025
979f45b
Implement Phase 1: Core type system with store parameter support
claude Dec 25, 2025
6926c58
Remove legacy AttributeAdapter support, update tests for AttributeType
claude Dec 25, 2025
97bc162
Simplify core type system: remove SERIALIZED_TYPES, clarify blob sema…
claude Dec 25, 2025
2de222a
Simplify type system: only core types and AttributeTypes
claude Dec 25, 2025
f35e027
Define complete core type system with blob→longblob mapping
claude Dec 25, 2025
746108a
Implement Phase 2: Content-Addressed Storage
claude Dec 25, 2025
328a59a
Apply ruff-format to content_registry.py
claude Dec 25, 2025
bbbfbc3
Remove legacy compatibility shims: attribute_adapter.py, bypass_seria…
claude Dec 25, 2025
3c4608f
Update implementation plan to reflect actual implementation
claude Dec 25, 2025
70fb567
Move built-in AttributeTypes to separate builtin_types.py module
claude Dec 25, 2025
ad09877
Implement ObjectType for path-addressed storage
claude Dec 25, 2025
dd8c623
Remove migration phase from implementation plan
claude Dec 25, 2025
e1b3be1
Add staged insert documentation to implementation plan
claude Dec 25, 2025
ca0b914
Implement Phase 3: AttachType, XAttachType, FilepathType
claude Dec 25, 2025
d0f5614
Implement Phase 5 (GC) and Phase 6 (Tests)
claude Dec 25, 2025
73535de
Add object type garbage collection support
claude Dec 26, 2025
3fc00ee
Move EXTERNAL_TABLE_ROOT to external.py (deprecated)
claude Dec 26, 2025
b4512c9
Remove deprecated external.py module
claude Dec 26, 2025
c951ee5
Replace ClassProperty with metaclass properties
claude Dec 30, 2025
6d5b745
Simplify test infrastructure to use docker-compose services
dimitri-yatsenko Dec 31, 2025
4edf5ed
Fix table_name and uuid type resolution bugs
dimitri-yatsenko Dec 31, 2025
5aa191f
Use <djblob> for automatic serialization, fix is_blob detection
dimitri-yatsenko Dec 31, 2025
8140530
Fix settings tests and config loading
dimitri-yatsenko Dec 31, 2025
09d1f1d
Fix adapted_attributes tests for new type system
dimitri-yatsenko Dec 31, 2025
d51c16e
Fix test failures and update to new type system
dimitri-yatsenko Dec 31, 2025
7d32ea1
Fix object type and remove legacy external tables
dimitri-yatsenko Jan 1, 2026
ef6c66d
Remove legacy log table and bump version to 2.0.0a1
dimitri-yatsenko Jan 1, 2026
0107d8d
Fix Table.describe() to show core types instead of native types
dimitri-yatsenko Jan 1, 2026
c486936
Fix config precedence: environment variables now override config files
dimitri-yatsenko Jan 1, 2026
27a3778
Fix test compatibility and remove deprecated s3.py
dimitri-yatsenko Jan 1, 2026
d956288
Remove dead code from Table class
dimitri-yatsenko Jan 1, 2026
c96581e
Simplify test setup and reorganize test structure
dimitri-yatsenko Jan 1, 2026
a82128d
Fix pyparsing deprecation warnings
dimitri-yatsenko Jan 1, 2026
08ed432
Fix pydantic model_fields deprecation warning
dimitri-yatsenko Jan 1, 2026
7 changes: 4 additions & 3 deletions .devcontainer/Dockerfile
@@ -8,6 +8,7 @@ RUN \
pip uninstall datajoint -y

USER root
-ENV DJ_HOST db
-ENV DJ_USER root
-ENV DJ_PASS password
+ENV DJ_HOST=db
+ENV DJ_USER=root
+ENV DJ_PASS=password
+ENV S3_ENDPOINT=minio:9000
9 changes: 4 additions & 5 deletions .devcontainer/devcontainer.json
@@ -1,7 +1,6 @@
{
"image": "mcr.microsoft.com/devcontainers/typescript-node:0-18",
"features": {
"ghcr.io/devcontainers/features/docker-in-docker:2": {}
},
"dockerComposeFile": ["../docker-compose.yaml", "docker-compose.yml"],
"service": "app",
"workspaceFolder": "/src",
"postCreateCommand": "curl -fsSL https://pixi.sh/install.sh | bash && echo 'export PATH=\"$HOME/.pixi/bin:$PATH\"' >> ~/.bashrc"
}
}
24 changes: 4 additions & 20 deletions .devcontainer/docker-compose.yml
@@ -1,30 +1,14 @@
# Devcontainer overrides for the app service from ../docker-compose.yaml
# Inherits db and minio services automatically
services:
# Update this to the name of the service you want to work with in your docker-compose.yml file
app:
# Uncomment if you want to override the service's Dockerfile to one in the .devcontainer
# folder. Note that the path of the Dockerfile and context is relative to the *primary*
# docker-compose.yml file (the first in the devcontainer.json "dockerComposeFile"
# array). The sample below assumes your primary file is in the root of your project.
container_name: datajoint-python-devcontainer
image: datajoint/datajoint-python-devcontainer:${PY_VER:-3.11}-${DISTRO:-bookworm}
build:
context: .
context: ..
dockerfile: .devcontainer/Dockerfile
args:
- PY_VER=${PY_VER:-3.11}
- DISTRO=${DISTRO:-bookworm}

volumes:
# Update this to wherever you want VS Code to mount the folder of your project
- ..:/workspaces:cached

# Uncomment the next four lines if you will use a ptrace-based debugger like C++, Go, and Rust.
# cap_add:
# - SYS_PTRACE
# security_opt:
# - seccomp:unconfined

user: root

# Overrides default command so things don't shut down after the process ends.
# Keep container running for devcontainer
command: /bin/sh -c "while sleep 1000; do :; done"
1 change: 1 addition & 0 deletions .gitignore
@@ -187,3 +187,4 @@ dj_local_conf.json
!.vscode/launch.json
# pixi environments
.pixi
_content/
70 changes: 70 additions & 0 deletions README.md
@@ -141,3 +141,73 @@ DataJoint (<https://datajoint.com>).
- [Contribution Guidelines](https://docs.datajoint.com/about/contribute/)

- [Developer Guide](https://docs.datajoint.com/core/datajoint-python/latest/develop/)

## Developer Guide

### Prerequisites

- [Docker](https://docs.docker.com/get-docker/) for MySQL and MinIO services
- Python 3.10+

### Running Tests

Tests are organized into `unit/` (no external services) and `integration/` (requires MySQL + MinIO):

```bash
# Install dependencies
pip install -e ".[test]"

# Run unit tests only (fast, no Docker needed)
pytest tests/unit/

# Start MySQL and MinIO for integration tests
docker compose up -d db minio

# Run all tests
pytest tests/

# Run specific test file
pytest tests/integration/test_blob.py -v

# Stop services when done
docker compose down
```

### Alternative: Full Docker

Run tests entirely in Docker (no local Python needed):

```bash
docker compose --profile test up djtest --build
```

### Alternative: Using pixi

[pixi](https://pixi.sh) users can run tests with automatic service management:

```bash
pixi install # First time setup
pixi run test # Starts services and runs tests
pixi run services-down # Stop services
```

### Pre-commit Hooks

```bash
pre-commit install # Install hooks (first time)
pre-commit run --all-files # Run all checks
```

### Environment Variables

Tests use these defaults (configured in `pyproject.toml`):

| Variable | Default | Description |
|----------|---------|-------------|
| `DJ_HOST` | `localhost` | MySQL hostname |
| `DJ_PORT` | `3306` | MySQL port |
| `DJ_USER` | `root` | MySQL username |
| `DJ_PASS` | `password` | MySQL password |
| `S3_ENDPOINT` | `localhost:9000` | MinIO endpoint |

For Docker-based testing (devcontainer, djtest), set `DJ_HOST=db` and `S3_ENDPOINT=minio:9000`.
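
The defaults in the table above can be pictured as a simple lookup with environment overrides. The following is a minimal sketch only: the helper name `resolve_test_settings` and the `TEST_DEFAULTS` mapping are hypothetical, and this is not how `pyproject.toml` or DataJoint itself applies the values.

```python
import os

# Defaults copied from the table above. The mapping and helper names are
# hypothetical and exist only for this illustration.
TEST_DEFAULTS = {
    "DJ_HOST": "localhost",
    "DJ_PORT": "3306",
    "DJ_USER": "root",
    "DJ_PASS": "password",
    "S3_ENDPOINT": "localhost:9000",
}


def resolve_test_settings(environ=None):
    """Return each setting, preferring the environment over the default."""
    environ = os.environ if environ is None else environ
    return {name: environ.get(name, default) for name, default in TEST_DEFAULTS.items()}


# Local run: nothing set, so the localhost defaults apply.
local = resolve_test_settings({})

# Docker-based run: only the two hostnames differ.
docker = resolve_test_settings({"DJ_HOST": "db", "S3_ENDPOINT": "minio:9000"})
```

Passing an explicit mapping instead of reading `os.environ` directly keeps the sketch easy to check without touching the real environment.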
23 changes: 12 additions & 11 deletions docker-compose.yaml
@@ -1,15 +1,19 @@
# Development environment with MySQL and MinIO services
# To run tests: pytest --cov-report term-missing --cov=datajoint tests
#
# Quick start:
# docker compose up -d db minio # Start services
# pytest tests/ # Run tests (uses localhost defaults)
#
# Full Docker testing:
# docker compose --profile test up djtest --build
services:
db:
image: datajoint/mysql:${MYSQL_VER:-8.0}
environment:
- MYSQL_ROOT_PASSWORD=${DJ_PASS:-password}
command: mysqld --default-authentication-plugin=mysql_native_password
# ports:
# - "3306:3306"
# volumes:
# - ./mysql/data:/var/lib/mysql
ports:
- "3306:3306"
healthcheck:
test: [ "CMD", "mysqladmin", "ping", "-h", "localhost" ]
timeout: 30s
@@ -20,18 +24,15 @@ services:
environment:
- MINIO_ACCESS_KEY=datajoint
- MINIO_SECRET_KEY=datajoint
# ports:
# - "9000:9000"
# volumes:
# - ./minio/config:/root/.minio
# - ./minio/data:/data
ports:
- "9000:9000"
command: server --address ":9000" /data
healthcheck:
test:
- "CMD"
- "curl"
- "--fail"
- "http://minio:9000/minio/health/live"
- "http://localhost:9000/minio/health/live"
timeout: 30s
retries: 5
interval: 15s
2 changes: 1 addition & 1 deletion docs/src/compute/key-source.md
@@ -45,7 +45,7 @@ definition = """
-> Recording
---
sample_rate : float
-eeg_data : longblob
+eeg_data : <djblob>
"""
key_source = Recording & 'recording_type = "EEG"'
```
4 changes: 2 additions & 2 deletions docs/src/compute/make.md
@@ -152,7 +152,7 @@ class ImageAnalysis(dj.Computed):
# Complex image analysis results
-> Image
---
-analysis_result : longblob
+analysis_result : <djblob>
processing_time : float
"""

@@ -188,7 +188,7 @@ class ImageAnalysis(dj.Computed):
# Complex image analysis results
-> Image
---
-analysis_result : longblob
+analysis_result : <djblob>
processing_time : float
"""

6 changes: 3 additions & 3 deletions docs/src/compute/populate.md
@@ -40,7 +40,7 @@ class FilteredImage(dj.Computed):
# Filtered image
-> Image
---
-filtered_image : longblob
+filtered_image : <djblob>
"""

def make(self, key):
@@ -196,7 +196,7 @@ class ImageAnalysis(dj.Computed):
# Complex image analysis results
-> Image
---
-analysis_result : longblob
+analysis_result : <djblob>
processing_time : float
"""

@@ -230,7 +230,7 @@ class ImageAnalysis(dj.Computed):
# Complex image analysis results
-> Image
---
-analysis_result : longblob
+analysis_result : <djblob>
processing_time : float
"""

2 changes: 1 addition & 1 deletion docs/src/design/integrity.md
@@ -142,7 +142,7 @@ definition = """
-> EEGRecording
channel_idx : int
---
-channel_data : longblob
+channel_data : <djblob>
"""
```
![doc_1-many](../images/doc_1-many.png){: style="align:center"}
13 changes: 11 additions & 2 deletions docs/src/design/tables/attributes.md
@@ -48,9 +48,10 @@ fractional digits.
Because of its well-defined precision, `decimal` values can be used in equality
comparison and be included in primary keys.

-- `longblob`: arbitrary numeric array (e.g. matrix, image, structure), up to 4
+- `longblob`: raw binary data, up to 4
[GiB](http://en.wikipedia.org/wiki/Gibibyte) in size.
-Numeric arrays are compatible between MATLAB and Python (NumPy).
+Stores and returns raw bytes without serialization.
+For serialized Python objects (arrays, dicts, etc.), use `<djblob>` instead.
The `longblob` and other `blob` datatypes can be configured to store data
[externally](../../sysadmin/external-store.md) by using the `blob@store` syntax.
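
To make the distinction in the revised `longblob` entry concrete, here is a hedged sketch of a definition that declares one column of each kind. The attribute names are hypothetical, the string is not passed to DataJoint, and `attribute_types` is a toy parser written for this illustration, not part of the library.

```python
# Hypothetical table definition; the attribute names are invented for
# illustration and the string is never handed to DataJoint here.
definition = """
recording_id : int
---
raw_packet : longblob    # raw bytes, stored and returned unchanged
spectrum : <djblob>      # serialized Python object (e.g. a NumPy array)
"""


def attribute_types(definition):
    """Map each attribute name to its declared type (toy parser)."""
    types = {}
    for line in definition.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if ":" in line:
            name, dtype = (part.strip() for part in line.split(":", 1))
            types[name] = dtype
    return types


declared = attribute_types(definition)
```

Under the semantics described above, `raw_packet` would accept and return `bytes`, while `spectrum` would accept Python objects and have them serialized.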

@@ -71,6 +72,10 @@ info).
These types abstract certain kinds of non-database data to facilitate use
together with DataJoint.

- `<djblob>`: DataJoint's native serialization format for Python objects. Supports
NumPy arrays, dicts, lists, datetime objects, and nested structures. Compatible with
MATLAB. See [custom types](customtype.md) for details.

- `object`: managed [file and folder storage](object.md) with support for direct writes
(Zarr, HDF5) and fsspec integration. Recommended for new pipelines.

@@ -80,6 +85,10 @@ sending/receiving an opaque data file to/from a DataJoint pipeline.
- `filepath@store`: a [filepath](filepath.md) used to link non-DataJoint managed files
into a DataJoint pipeline.

- `<custom_type>`: a [custom attribute type](customtype.md) that defines bidirectional
conversion between Python objects and database storage formats. Use this to store
complex data types like graphs, domain-specific objects, or custom data structures.
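
The "bidirectional conversion" mentioned for custom types can be sketched in plain Python. This is an assumption-laden illustration: `GraphCodec` and its `encode`/`decode` method names are hypothetical and do not reproduce DataJoint's actual custom-type interface, and JSON stands in for whatever storage format a real type would choose.

```python
import json


class GraphCodec:
    """Hypothetical stand-in for a custom attribute type (not DataJoint's API)."""

    def encode(self, edges):
        # Python object -> storage format. Sorting makes equal graphs
        # produce identical stored bytes.
        return json.dumps(sorted(list(edge) for edge in edges)).encode("utf-8")

    def decode(self, stored):
        # Storage format -> Python object: rebuild the set of edge tuples.
        return {tuple(edge) for edge in json.loads(stored.decode("utf-8"))}


codec = GraphCodec()
edges = {(1, 2), (2, 3)}
stored = codec.encode(edges)      # what would be written to the database
restored = codec.decode(stored)   # what a fetch would hand back
```

The property that matters is the round trip: decoding an encoded value should return an object equal to the original.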

## Numeric type aliases

DataJoint provides convenient type aliases that map to standard MySQL numeric types.