
Architecture Overview

Dits is built as a layered system: a core engine that handles content management, a transport layer for network operations, and client interfaces for different use cases.

High-Level Architecture

Client Layer
  • CLI (clap)
  • GUI (Tauri)
  • SDK (Rust)
  • VFS (FUSE)
  • NLE Plugins (Premiere/Resolve)

Core Engine
  • Hybrid Storage (libgit2 + FastCDC)
  • BLAKE3 Hashing
  • ISOBMFF Parsing
  • Manifest Management
  • Conflict Resolution

Transport Layer
  • QUIC (quinn)
  • Delta Sync
  • Bandwidth Estimation
  • Resume/Retry

  • LOCAL (.dits/)
  • SERVER (Axum)
  • PostgreSQL + Redis
  • STORAGE (S3)

Hybrid Storage System

Dits uses a hybrid storage model that routes each file to a storage engine based on its type:

File In → Classify (by type)

  • Text File (.md, .json) → libgit2: Diff, Merge, Blame
  • Binary File (.mp4, .mov) → FastCDC: Chunk, Dedup, Delta
  • Hybrid (.prproj) → Git + CDC combined
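
The routing step above can be sketched as a small classifier. This is a minimal illustration, not Dits' actual API: the `StorageRoute` enum and `classify` function are hypothetical names, the extension lists cover only the examples mentioned here, and sending unknown extensions to the chunked engine is an assumption.

```rust
use std::path::Path;

/// Storage engine a file is routed to (hypothetical names, for illustration only).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum StorageRoute {
    /// Text files: line-based diff, merge, and blame through the Git engine.
    Git,
    /// Binary and media files: content-defined chunking with deduplication.
    Chunked,
    /// Project files handled by both engines combined.
    Hybrid,
}

/// Classify a path by its extension, ignoring ASCII case.
/// Assumption: anything unrecognized is treated like a binary file.
pub fn classify(path: &Path) -> StorageRoute {
    let ext = path
        .extension()
        .and_then(|e| e.to_str())
        .map(|e| e.to_ascii_lowercase());

    match ext.as_deref() {
        Some("md" | "json") => StorageRoute::Git,
        Some("prproj") => StorageRoute::Hybrid,
        // Covers .mp4 and .mov as well as unknown or missing extensions.
        _ => StorageRoute::Chunked,
    }
}
```

A real classifier would likely also inspect file contents, since an extension alone can be wrong.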

Core Components

  • Content Store: The content-addressable storage layer handles chunking, hashing, deduplication, and compression of all repository data.
  • Object Model: Commits, trees, assets, and chunks form a directed acyclic graph (DAG) that represents repository history.
  • Reference System: Branches, tags, and HEAD provide named pointers into the commit graph, enabling version navigation.
  • Transport Protocol: QUIC-based protocol for efficient chunk transfer with delta synchronization and resumable uploads.

The Chunking Pipeline

Binary and media files pass through Dits' content-defined chunking pipeline for efficient storage and deduplication:

Binary File → FastCDC Chunker → BLAKE3 Hash → Content-Addressable Store

Deduplication Results

  • Move file A→B: 0 bytes stored (hashes match), 100% saved
  • Trim video start: ~5% of file stored (only start chunks change), ~95% saved
  • Append to file: size of the append only (existing chunks reused), savings vary
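
To show why appending reuses existing chunks, here is a minimal content-defined chunker. It is a simplified gear-style rolling hash written for this page, not the FastCDC algorithm itself (FastCDC adds normalized chunking, among other refinements), and the `min`, `max`, and `mask` parameters are illustrative assumptions rather than Dits' real settings.

```rust
/// Deterministic table of 256 pseudo-random values (splitmix64 mixing).
/// A real chunker would ship a fixed, precomputed table.
fn gear_table() -> [u64; 256] {
    let mut table = [0u64; 256];
    let mut state: u64 = 0;
    for entry in table.iter_mut() {
        state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        *entry = z ^ (z >> 31);
    }
    table
}

/// Return the end offset of every chunk in `data`.
/// A cut is placed where the rolling hash matches `mask` (once the chunk
/// has at least `min` bytes) or when the chunk reaches `max` bytes.
pub fn chunk_boundaries(data: &[u8], min: usize, max: usize, mask: u64) -> Vec<usize> {
    let table = gear_table();
    let mut cuts = Vec::new();
    let mut start = 0;
    let mut hash: u64 = 0;

    for (i, &byte) in data.iter().enumerate() {
        // Shifting left ages out old bytes, so the low bits of the hash
        // depend only on the most recent bytes of content.
        hash = (hash << 1).wrapping_add(table[byte as usize]);
        let len = i + 1 - start;
        if len >= max || (len >= min && (hash & mask) == 0) {
            cuts.push(i + 1);
            start = i + 1;
            hash = 0;
        }
    }
    // Whatever is left after the last cut becomes the final chunk.
    if start < data.len() {
        cuts.push(data.len());
    }
    cuts
}
```

Because each cut depends only on the bytes before it, appending data leaves every earlier boundary in place; only the final chunk of the original file can change. After an edit near the start, boundaries usually realign within a few chunks, which is the behavior the trim example relies on.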

Design Principles

Content-Addressable Storage

All data in Dits is identified by its cryptographic hash (BLAKE3). This provides:

  • Automatic deduplication: Identical content is stored once
  • Data integrity: Corruption is immediately detectable
  • Immutability: Content cannot be changed without changing its address
  • Parallel verification: Multiple sources can be verified independently
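
The properties listed above can be illustrated with a tiny in-memory store. This is a sketch under stated assumptions: `ContentStore`, `put`, and `get_verified` are hypothetical names, and a 64-bit FNV-1a hash stands in for BLAKE3 only so the example needs no external crates. FNV-1a is not collision-resistant and would not be suitable in practice.

```rust
use std::collections::HashMap;

/// Stand-in content address. Dits uses BLAKE3; a u64 is an assumption
/// made to keep this sketch dependency-free.
pub type Address = u64;

/// 64-bit FNV-1a, used here ONLY as a placeholder for a cryptographic hash.
pub fn address_of(data: &[u8]) -> Address {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325; // FNV-1a offset basis
    for &byte in data {
        hash ^= u64::from(byte);
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3); // FNV-1a prime
    }
    hash
}

/// In-memory content-addressable store (hypothetical type).
#[derive(Default)]
pub struct ContentStore {
    objects: HashMap<Address, Vec<u8>>,
}

impl ContentStore {
    /// Store `data` under its own hash.
    /// Returns the address and whether new bytes were actually written.
    pub fn put(&mut self, data: &[u8]) -> (Address, bool) {
        let addr = address_of(data);
        let is_new = !self.objects.contains_key(&addr);
        if is_new {
            self.objects.insert(addr, data.to_vec());
        }
        (addr, is_new)
    }

    /// Fetch an object and re-hash it; returns None if the object is
    /// missing or its content no longer matches its address.
    pub fn get_verified(&self, addr: Address) -> Option<&[u8]> {
        let data = self.objects.get(&addr)?;
        (address_of(data) == addr).then(|| data.as_slice())
    }

    /// Number of distinct objects stored.
    pub fn len(&self) -> usize {
        self.objects.len()
    }
}
```

Storing the same bytes twice yields the same address and writes nothing the second time, which is the deduplication property; re-hashing on read is what makes corruption detectable.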

Format-Aware Processing

Unlike generic version control, Dits understands the structure of media files:

  • Container parsing: MP4/MOV atoms are preserved intact
  • Keyframe alignment: Chunks align to video I-frames
  • Metadata extraction: Duration, codec, resolution indexed
  • Temporal awareness: Changes tracked by timecode, not just bytes

Efficient Synchronization

The transport layer minimizes data transfer:

  • Delta sync: Only missing chunks are transferred
  • Parallel streams: Multiple chunks transfer simultaneously
  • Resumable: Interrupted transfers continue where they stopped
  • Bandwidth adaptive: Adjusts to network conditions

Crate Structure

Dits is organized into several Rust crates:

  • dits-core/: Chunking, hashing, manifests, object model
  • dits-parsers/: ISOBMFF, NLE project file parsing
  • dits-storage/: Local and remote storage backends
  • dits-protocol/: Wire protocol, serialization
  • dits-client/: CLI implementation
  • dits-server/: REST API and QUIC server
  • dits-vfs/: FUSE virtual filesystem
  • dits-sdk/: Public Rust SDK

Data Flow

Adding a File

  1. Read file from filesystem
  2. Parse container format (if video file)
  3. Chunk using FastCDC (with keyframe alignment)
  4. Hash each chunk (BLAKE3)
  5. Check for existing chunks (deduplication)
  6. Compress new chunks
  7. Store in .dits/objects/chunks/
  8. Create asset manifest
  9. Update staging index
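
Steps 4, 5, and 8 above can be sketched together: hash each chunk, check whether it is already known, and record an ordered manifest. The `ChunkRef` and `AssetManifest` types, their fields, and the `build_manifest` function are hypothetical, as is the 64-bit address; the hash function is passed in so the sketch stays independent of any BLAKE3 crate. Parsing, compression, and the staging index are left out.

```rust
use std::collections::HashSet;

/// One chunk as referenced by a manifest (hypothetical layout).
#[derive(Debug, PartialEq)]
pub struct ChunkRef {
    pub address: u64,
    pub offset: usize,
    pub len: usize,
}

/// Ordered chunk list from which a file can be rebuilt (hypothetical layout).
#[derive(Debug, PartialEq)]
pub struct AssetManifest {
    pub size: usize,
    pub chunks: Vec<ChunkRef>,
}

/// Build a manifest for `data`, given chunk end offsets in `cuts`.
/// `known` holds addresses already in the store and is updated in place.
/// Returns the manifest plus the number of bytes that are actually new.
/// Panics if `cuts` is not ascending or exceeds `data.len()`.
pub fn build_manifest(
    data: &[u8],
    cuts: &[usize],
    known: &mut HashSet<u64>,
    hash: impl Fn(&[u8]) -> u64,
) -> (AssetManifest, usize) {
    let mut chunks = Vec::with_capacity(cuts.len());
    let mut new_bytes = 0;
    let mut start = 0;

    for &end in cuts {
        let chunk = &data[start..end];
        let address = hash(chunk);
        // `insert` returns true only for addresses not seen before,
        // so repeated chunks add nothing to the stored total.
        if known.insert(address) {
            new_bytes += chunk.len();
        }
        chunks.push(ChunkRef { address, offset: start, len: chunk.len() });
        start = end;
    }

    (AssetManifest { size: data.len(), chunks }, new_bytes)
}
```

The manifest still lists every chunk in order, including repeats, because it describes how to rebuild the file; only the stored byte count benefits from deduplication.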

Pushing Changes

  1. Enumerate commits to push
  2. Get tree and asset hashes
  3. Query remote for existing chunks
  4. Upload only missing chunks (delta)
  5. Upload manifests and commits
  6. Update remote references
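
Steps 3 and 4 amount to a set difference between the chunk hashes referenced locally and those the remote already holds. The function below is a minimal sketch of that idea; `missing_chunks` is a hypothetical name, and hashes are plain strings here for readability rather than any real wire format.

```rust
use std::collections::HashSet;

/// Return the chunk hashes from `local` that the remote does not have,
/// in first-seen order and without duplicates.
pub fn missing_chunks(local: &[String], remote_has: &HashSet<String>) -> Vec<String> {
    let mut queued: HashSet<&str> = HashSet::new();
    let mut to_upload = Vec::new();

    for hash in local {
        // Skip chunks the remote already stores, and chunks that are
        // already queued because another asset references them too.
        if !remote_has.contains(hash) && queued.insert(hash.as_str()) {
            to_upload.push(hash.clone());
        }
    }
    to_upload
}
```

Only the returned hashes need their chunk data uploaded; manifests and commits then reference chunks the remote is known to hold.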

Cloning a Repository

  1. Fetch remote refs (branches, tags)
  2. Fetch commit graph
  3. Fetch tree manifests
  4. Fetch asset manifests
  5. (Sparse) Mark required chunks
  6. Fetch and verify chunks
  7. Reconstruct working directory
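
Steps 6 and 7 can be sketched as: walk a manifest's chunk addresses in order, re-hash each fetched chunk, and concatenate the verified bytes. The `reconstruct` function and `RestoreError` type are hypothetical, addresses are shown as `u64` for brevity, and the hash function is a parameter so the sketch does not depend on a BLAKE3 crate.

```rust
use std::collections::HashMap;

/// Why a file could not be rebuilt (hypothetical error type).
#[derive(Debug, PartialEq)]
pub enum RestoreError {
    /// The manifest references a chunk that was never fetched.
    MissingChunk(u64),
    /// A fetched chunk does not hash to the address it was stored under.
    CorruptChunk(u64),
}

/// Rebuild a file from `manifest`, an ordered list of chunk addresses.
/// Every chunk is re-hashed with `hash` before its bytes are used.
pub fn reconstruct(
    manifest: &[u64],
    fetched: &HashMap<u64, Vec<u8>>,
    hash: impl Fn(&[u8]) -> u64,
) -> Result<Vec<u8>, RestoreError> {
    let mut out = Vec::new();
    for &addr in manifest {
        let chunk = fetched
            .get(&addr)
            .ok_or(RestoreError::MissingChunk(addr))?;
        if hash(chunk) != addr {
            return Err(RestoreError::CorruptChunk(addr));
        }
        out.extend_from_slice(chunk);
    }
    Ok(out)
}
```

A chunk that appears several times in the manifest is fetched once and reused, and verification fails closed: no partial file is returned when any chunk is missing or corrupt.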

Security Model

  • Authentication: JWT tokens with refresh, MFA support
  • Authorization: Repository-level permissions (admin, write, read)
  • Encryption at rest: Optional AES-256-GCM chunk encryption
  • Encryption in transit: TLS 1.3 for all connections
  • Integrity: All content verified by BLAKE3 hash

Detailed Documentation