Architecture Overview
Dits is built as a layered system: a core engine that handles content management, a transport layer for network operations, and client interfaces for different use cases.
High-Level Architecture
- Client Layer: CLI (clap), GUI (Tauri), SDK (Rust), VFS (FUSE), NLE Plugins (Premiere/Resolve)
- Core Engine: Hybrid Storage (libgit2 + FastCDC), BLAKE3 Hashing, ISOBMFF Parsing, Manifest Management, Conflict Resolution
- Transport Layer: QUIC (quinn), Delta Sync, Bandwidth Estimation, Resume/Retry
- LOCAL: .dits/
- SERVER: Axum, PostgreSQL + Redis
- STORAGE: S3
Hybrid Storage System
Dits uses a hybrid storage model that routes each file to a storage engine based on its type:
File In → Classify (by type):
- Text File (.md, .json) → libgit2: Diff, Merge, Blame
- Binary File (.mp4, .mov) → FastCDC: Chunk, Dedup, Delta
- Hybrid (.prproj) → Git + CDC combined
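The routing step can be sketched as a lookup on the file extension. This is a minimal illustration, not the actual Dits classifier: the `Engine` enum and `classify` function are hypothetical names, only the extensions listed above are covered, and the fallback for unknown extensions is an assumption.

```rust
use std::path::Path;

/// Storage engines named in the routing description above (hypothetical type).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Engine {
    /// Text files: diff, merge, and blame through libgit2.
    Git,
    /// Binary and media files: FastCDC chunking, dedup, delta.
    Cdc,
    /// Project files that combine both engines.
    Hybrid,
}

/// Route a path to an engine by its extension (case-insensitive).
pub fn classify(path: &Path) -> Engine {
    let ext = path
        .extension()
        .and_then(|e| e.to_str())
        .map(|e| e.to_ascii_lowercase());
    match ext.as_deref() {
        Some("md") | Some("json") => Engine::Git,
        Some("mp4") | Some("mov") => Engine::Cdc,
        Some("prproj") => Engine::Hybrid,
        // Assumption: anything unrecognized is treated as opaque binary data.
        _ => Engine::Cdc,
    }
}
```

A real classifier would likely also inspect file contents, since an extension alone can be wrong.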
Core Components
Content Store
The content-addressable storage layer handles chunking, hashing, deduplication, and compression of all repository data.
Object Model
Commits, trees, assets, and chunks form a directed acyclic graph (DAG) that represents repository history.
Reference System
Branches, tags, and HEAD provide named pointers into the commit graph, enabling version navigation.
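To make the object graph and the reference system concrete, here is a rough sketch of how they could be modeled. All type and field names are hypothetical, and the 32-byte identifier assumes objects are addressed by a BLAKE3 digest; the real on-disk layout may differ.

```rust
use std::collections::HashMap;

/// Assumption: every object is addressed by a 32-byte BLAKE3 digest.
pub type Hash = [u8; 32];

/// A commit points at one tree and at zero or more parent commits.
pub struct Commit {
    pub parents: Vec<Hash>,
    pub tree: Hash,
    pub message: String,
}

/// A tree maps paths to asset hashes.
pub struct Tree {
    pub entries: Vec<(String, Hash)>,
}

/// An asset is an ordered list of chunk hashes.
pub struct Asset {
    pub chunks: Vec<Hash>,
}

/// Named pointers into the commit graph.
pub struct Refs {
    pub branches: HashMap<String, Hash>,
    pub tags: HashMap<String, Hash>,
    /// Name of the branch HEAD currently points at.
    pub head: String,
}

/// Walk first-parent history starting at HEAD, newest commit first.
pub fn first_parent_history(commits: &HashMap<Hash, Commit>, refs: &Refs) -> Vec<Hash> {
    let mut out = Vec::new();
    let mut cursor = refs.branches.get(&refs.head).copied();
    while let Some(id) = cursor {
        out.push(id);
        cursor = commits.get(&id).and_then(|c| c.parents.first().copied());
    }
    out
}
```

Because commits only ever reference older commits, the walk terminates at a root commit with no parents.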
Transport Protocol
QUIC-based protocol for efficient chunk transfer with delta synchronization and resumable uploads.
The Chunking Pipeline
Binary and media files pass through Dits' content-defined chunking pipeline for efficient storage and deduplication:
Binary File → FastCDC Chunker → BLAKE3 Hash → Content-Addressable Store
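The chunker stage can be illustrated with a toy content-defined chunker. This is not FastCDC itself: it is a simplified gear-style rolling hash with hypothetical `min`, `max`, and `mask` parameters, shown only to convey how cut points are derived from content rather than from fixed offsets.

```rust
/// Deterministic per-byte constant (splitmix64 mixing), standing in for a gear table.
fn gear(b: u8) -> u64 {
    let mut z = (b as u64).wrapping_add(1).wrapping_mul(0x9E37_79B9_7F4A_7C15);
    z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
    z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
    z ^ (z >> 31)
}

/// Hypothetical chunker settings: size bounds in bytes and a cut mask.
pub struct ChunkerConfig {
    pub min: usize,
    pub max: usize,
    pub mask: u64,
}

/// Return the end offset of every chunk in `data`.
/// A cut is made when the rolling hash matches the mask (after `min` bytes)
/// or when the chunk reaches `max` bytes.
pub fn cut_points(data: &[u8], cfg: &ChunkerConfig) -> Vec<usize> {
    let mut cuts = Vec::new();
    let mut start = 0;
    let mut h: u64 = 0;
    for (i, &b) in data.iter().enumerate() {
        // Older bytes shift out of the 64-bit state, so the hash depends
        // only on the most recent bytes of content.
        h = (h << 1).wrapping_add(gear(b));
        let len = i + 1 - start;
        if (len >= cfg.min && (h & cfg.mask) == 0) || len >= cfg.max {
            cuts.push(i + 1);
            start = i + 1;
            h = 0;
        }
    }
    if start < data.len() {
        cuts.push(data.len()); // trailing partial chunk
    }
    cuts
}
```

Each resulting chunk would then be hashed and stored under its hash. A wider mask yields larger average chunks.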
Deduplication Results
- Move file A→B: 0 bytes (hashes match), 100% saved
- Trim video start: ~5% of file (only start chunks change), 95% saved
- Append to file: size of append only (existing chunks reused), savings vary
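The move and append cases can be reproduced with a small in-memory store keyed by chunk hash. This is a sketch under stated assumptions: `ChunkStore` and `add_file` are hypothetical names, and `DefaultHasher` from the standard library stands in for BLAKE3 only to keep the example dependency-free (it is not cryptographic).

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Stand-in address. Assumption: Dits would use a BLAKE3 digest here.
fn address(chunk: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    chunk.hash(&mut h);
    h.finish()
}

/// Hypothetical content-addressable chunk store.
#[derive(Default)]
pub struct ChunkStore {
    chunks: HashMap<u64, Vec<u8>>,
}

impl ChunkStore {
    /// Store every chunk of a file and return the number of bytes actually written.
    /// Chunks whose address is already present are skipped.
    pub fn add_file(&mut self, chunks: &[&[u8]]) -> usize {
        let mut written = 0;
        for chunk in chunks {
            let addr = address(chunk);
            if !self.chunks.contains_key(&addr) {
                self.chunks.insert(addr, chunk.to_vec());
                written += chunk.len();
            }
        }
        written
    }
}
```

Adding the same chunks a second time, as happens when a file is moved, writes nothing; appending one chunk to an existing file writes only that chunk.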
Design Principles
Content-Addressable Storage
All data in Dits is identified by its cryptographic hash (BLAKE3). This provides:
- Automatic deduplication: Identical content is stored once
- Data integrity: Corruption is detectable by rehashing content and comparing the result against its address
- Immutability: Content cannot be changed without changing its address
- Parallel verification: Multiple sources can be verified independently
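The integrity and parallel-verification properties follow directly from addressing content by its hash, as the sketch below tries to show. The function names are hypothetical, `DefaultHasher` is a non-cryptographic stand-in for BLAKE3, and spawning one thread per chunk is for illustration only.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::thread;

/// Stand-in for a BLAKE3 digest (assumption; DefaultHasher is not cryptographic).
fn address(bytes: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    bytes.hash(&mut h);
    h.finish()
}

/// A chunk is valid only if rehashing its bytes reproduces its address.
pub fn verify(addr: u64, bytes: &[u8]) -> bool {
    address(bytes) == addr
}

/// Verify chunks on separate threads. Each check needs only its own chunk,
/// so no coordination between checks is required.
/// Returns the indices of chunks that failed verification.
pub fn verify_all(chunks: &[(u64, Vec<u8>)]) -> Vec<usize> {
    thread::scope(|s| {
        let handles: Vec<_> = chunks
            .iter()
            .map(|(addr, bytes)| s.spawn(move || verify(*addr, bytes)))
            .collect();
        handles
            .into_iter()
            .enumerate()
            .filter_map(|(i, h)| if h.join().unwrap() { None } else { Some(i) })
            .collect()
    })
}
```

A production implementation would more likely use a bounded worker pool than a thread per chunk.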
Format-Aware Processing
Unlike generic version control, Dits understands the structure of media files:
- Container parsing: MP4/MOV atoms are preserved intact
- Keyframe alignment: Chunks align to video I-frames
- Metadata extraction: Duration, codec, and resolution are indexed
- Temporal awareness: Changes tracked by timecode, not just bytes
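Keyframe alignment could work in several ways, and this overview does not specify the policy. One possible approach, sketched below purely as an assumption, is to snap each content-defined cut point to the nearest keyframe byte offset when one lies within a tolerance. The function name and the `tolerance` parameter are hypothetical.

```rust
/// Snap each candidate cut point to the nearest keyframe offset that lies
/// within `tolerance` bytes. `cuts` and `keyframes` must be sorted ascending.
/// Cuts with no keyframe nearby are kept unchanged.
pub fn align_to_keyframes(cuts: &[u64], keyframes: &[u64], tolerance: u64) -> Vec<u64> {
    let mut out: Vec<u64> = cuts
        .iter()
        .map(|&cut| {
            // Index of the first keyframe at or after the cut.
            let i = keyframes.partition_point(|&k| k < cut);
            let after = keyframes.get(i).copied();
            let before = if i > 0 { Some(keyframes[i - 1]) } else { None };
            before
                .into_iter()
                .chain(after)
                .filter(|k| k.abs_diff(cut) <= tolerance)
                .min_by_key(|k| k.abs_diff(cut))
                .unwrap_or(cut)
        })
        .collect();
    // Two neighboring cuts may snap to the same keyframe.
    out.dedup();
    out
}
```

The trade-off is that snapping shifts chunk boundaries away from purely content-defined positions, so the tolerance bounds how far a boundary may move.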
Efficient Synchronization
The transport layer minimizes data transfer:
- Delta sync: Only missing chunks are transferred
- Parallel streams: Multiple chunks transfer simultaneously
- Resumable: Interrupted transfers continue where they stopped
- Bandwidth adaptive: Adjusts to network conditions
Crate Structure
Dits is organized into several Rust crates:
- dits-core/: Chunking, hashing, manifests, object model
- dits-parsers/: ISOBMFF, NLE project file parsing
- dits-storage/: Local and remote storage backends
- dits-protocol/: Wire protocol, serialization
- dits-client/: CLI implementation
- dits-server/: REST API and QUIC server
- dits-vfs/: FUSE virtual filesystem
- dits-sdk/: Public Rust SDK
Data Flow
Adding a File
1. Read file from filesystem
2. Parse container format (if video file)
3. Chunk using FastCDC (with keyframe alignment)
4. Hash each chunk (BLAKE3)
5. Check for existing chunks (deduplication)
6. Compress new chunks
7. Store in .dits/objects/chunks/
8. Create asset manifest
9. Update staging index
Pushing Changes
1. Enumerate commits to push
2. Get tree and asset hashes
3. Query remote for existing chunks
4. Upload only missing chunks (delta)
5. Upload manifests and commits
6. Update remote references
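Steps 3 and 4 of the push flow amount to a set difference between local and remote chunk identifiers. The sketch below assumes a hypothetical `Remote` trait and an in-memory implementation; the actual wire protocol is not described in this overview, and a real upload would carry the chunk bytes as well as the identifier.

```rust
use std::collections::HashSet;

/// Assumption: chunks are identified by a 32-byte BLAKE3 digest.
pub type ChunkId = [u8; 32];

/// Hypothetical remote interface.
pub trait Remote {
    /// Step 3: of the given ids, which does the remote already hold?
    fn existing(&self, ids: &[ChunkId]) -> HashSet<ChunkId>;
    /// Step 4: upload one chunk (payload omitted in this sketch).
    fn upload(&mut self, id: ChunkId);
}

/// Upload only the chunks the remote lacks and return how many were sent.
pub fn push_chunks<R: Remote>(remote: &mut R, local: &[ChunkId]) -> usize {
    let have = remote.existing(local);
    let mut sent = HashSet::new();
    for id in local {
        // Skip chunks the remote has, and chunks repeated within this push.
        if !have.contains(id) && sent.insert(*id) {
            remote.upload(*id);
        }
    }
    sent.len()
}

/// In-memory remote used for illustration.
#[derive(Default)]
pub struct MemoryRemote {
    pub stored: HashSet<ChunkId>,
}

impl Remote for MemoryRemote {
    fn existing(&self, ids: &[ChunkId]) -> HashSet<ChunkId> {
        ids.iter().filter(|id| self.stored.contains(*id)).copied().collect()
    }
    fn upload(&mut self, id: ChunkId) {
        self.stored.insert(id);
    }
}
```

Pushing the same set of chunks twice sends nothing the second time, which is what makes a retried push cheap.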
Cloning a Repository
1. Fetch remote refs (branches, tags)
2. Fetch commit graph
3. Fetch tree manifests
4. Fetch asset manifests
5. (Sparse) Mark required chunks
6. Fetch and verify chunks
7. Reconstruct working directory
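The last two steps, verifying chunks and reconstructing files, can be sketched as concatenating the chunks listed in an asset manifest in order. The names here are hypothetical, the manifest is reduced to a list of chunk addresses, and `DefaultHasher` again stands in for BLAKE3.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Stand-in for a BLAKE3 digest (assumption; not cryptographic).
fn address(bytes: &[u8]) -> u64 {
    let mut h = DefaultHasher::new();
    bytes.hash(&mut h);
    h.finish()
}

#[derive(Debug, PartialEq)]
pub enum RestoreError {
    /// The manifest references a chunk that is not in the store.
    Missing(u64),
    /// The stored bytes no longer hash to the chunk's address.
    Corrupt(u64),
}

/// Rebuild a file from its manifest, an ordered list of chunk addresses.
pub fn restore(manifest: &[u64], store: &HashMap<u64, Vec<u8>>) -> Result<Vec<u8>, RestoreError> {
    let mut out = Vec::new();
    for &addr in manifest {
        let chunk = store.get(&addr).ok_or(RestoreError::Missing(addr))?;
        if address(chunk) != addr {
            return Err(RestoreError::Corrupt(addr));
        }
        out.extend_from_slice(chunk);
    }
    Ok(out)
}
```

Since the manifest fixes the chunk order, chunks can be fetched in any order and assembled once all of them are present.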
Security Model
- Authentication: JWT tokens with refresh, MFA support
- Authorization: Repository-level permissions (admin, write, read)
- Encryption at rest: Optional AES-256-GCM chunk encryption
- Encryption in transit: TLS 1.3 for all connections
- Integrity: All content verified by BLAKE3 hash
Detailed Documentation
- Data Structures - Chunks, assets, commits, and manifests
- Algorithms - FastCDC, BLAKE3, keyframe alignment
- Network Protocol - QUIC transport, delta sync, API
- Security - Authentication, encryption, and access control