Architecture Overview

Dits is built as a layered system: a core engine that handles content management, a transport layer for network operations, and client interfaces for different use cases.

High-Level Architecture

Client Layer: CLI (clap), GUI (Tauri), SDK (Rust), VFS (FUSE), NLE Plugins (Premiere/Resolve)

Core Engine: Hybrid Storage (libgit2 + FastCDC), BLAKE3 Hashing, ISOBMFF Parsing, Manifest Management, Conflict Resolution

Transport Layer: QUIC (quinn), Delta Sync, Bandwidth Estimation, Resume/Retry

LOCAL: .dits/
SERVER: Axum, PostgreSQL + Redis
STORAGE: S3

Hybrid Storage System

Dits uses a hybrid storage model that routes each file to a storage engine based on its type:

File In → Classify (by type):

Text File (.md, .json) → libgit2: Diff, Merge, Blame
Binary File (.mp4, .mov) → FastCDC: Chunk, Dedup, Delta
Hybrid (.prproj) → Git + CDC combined
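The routing step above can be sketched as a function from file extension to storage engine. This is a minimal illustration, not Dits' actual classifier: the `Engine` enum and `classify` function are hypothetical names, the extension lists contain only the examples from the diagram, and sending unknown extensions to the chunked path is an assumption.

```rust
use std::path::Path;

/// Storage engines named in the diagram (hypothetical type for illustration).
#[derive(Debug, PartialEq, Eq, Clone, Copy)]
pub enum Engine {
    /// Text files: diff, merge, and blame through libgit2.
    Git,
    /// Binary and media files: FastCDC chunking, dedup, delta.
    Cdc,
    /// Project files such as .prproj: Git + CDC combined.
    Hybrid,
}

/// Route a path by its extension (case-insensitive).
pub fn classify(path: &Path) -> Engine {
    let ext = path
        .extension()
        .and_then(|e| e.to_str())
        .map(|e| e.to_ascii_lowercase());
    match ext.as_deref() {
        Some("md") | Some("json") => Engine::Git,
        Some("prproj") => Engine::Hybrid,
        // Assumption: anything unrecognized is treated as binary.
        _ => Engine::Cdc,
    }
}
```

A real classifier would probably inspect file content as well as the extension; the sketch only shows the three-way split.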

Core Components

Content Store

The content-addressable storage layer handles chunking, hashing, deduplication, and compression of all repository data.


Object Model

Commits, trees, assets, and chunks form a directed acyclic graph (DAG) that represents repository history.
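As a rough sketch of how these four object kinds could link into a DAG, the types below reference each other by hash, and `ancestors` walks commit parents. All struct and field names are hypothetical; the source does not specify the real layout.

```rust
use std::collections::{HashMap, HashSet};

/// Stand-in for a 32-byte BLAKE3 digest.
pub type Hash = [u8; 32];

/// A stored piece of file content.
pub struct Chunk { pub hash: Hash, pub len: u64 }
/// One file: an ordered list of chunk hashes.
pub struct Asset { pub chunks: Vec<Hash>, pub size: u64 }
/// A directory snapshot: paths mapped to asset hashes.
pub struct Tree { pub entries: Vec<(String, Hash)> }
/// A point in history: one tree plus zero or more parent commits.
pub struct Commit { pub tree: Hash, pub parents: Vec<Hash>, pub message: String }

/// Collect every commit reachable from `head`, visiting each one once.
pub fn ancestors(commits: &HashMap<Hash, Commit>, head: Hash) -> Vec<Hash> {
    let mut seen = HashSet::new();
    let mut stack = vec![head];
    let mut order = Vec::new();
    while let Some(id) = stack.pop() {
        if !seen.insert(id) {
            continue; // already visited through another parent edge
        }
        order.push(id);
        if let Some(commit) = commits.get(&id) {
            stack.extend(commit.parents.iter().copied());
        }
    }
    order
}
```

Because a merge commit has two parents, the same ancestor can be reached along several paths; the `seen` set is what keeps the walk linear in the number of commits.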


Reference System

Branches, tags, and HEAD provide named pointers into the commit graph, enabling version navigation.
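A minimal sketch of references as named pointers, assuming HEAD always names a branch and commits are identified by hex strings. The `Refs` type and its methods are hypothetical and only illustrate that committing moves a branch pointer while tags stay put.

```rust
use std::collections::HashMap;

/// Hypothetical in-memory reference table.
pub struct Refs {
    /// Branch name -> commit hash (hex).
    pub branches: HashMap<String, String>,
    /// Tag name -> commit hash (hex).
    pub tags: HashMap<String, String>,
    /// Name of the branch HEAD currently follows.
    pub head: String,
}

impl Refs {
    /// Resolve HEAD to a commit hash, if its branch has any commits yet.
    pub fn resolve_head(&self) -> Option<&str> {
        self.branches.get(&self.head).map(String::as_str)
    }

    /// Record a new commit: only the current branch pointer moves.
    pub fn advance(&mut self, new_commit: &str) {
        self.branches.insert(self.head.clone(), new_commit.to_string());
    }
}
```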


Transport Protocol

A QUIC-based protocol transfers chunks efficiently, with delta synchronization and resumable uploads.


The Chunking Pipeline

Binary and media files pass through Dits' content-defined chunking pipeline for efficient storage and deduplication:

Binary File → FastCDC Chunker → BLAKE3 Hash → Content-Addressable Store
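To illustrate what "content-defined" means in the chunker stage, here is a simplified gear-hash chunker. It is not the real FastCDC algorithm (it omits normalized chunking and other optimizations), and the gear table, function names, and parameters are assumptions made for this sketch. Cut points depend on a rolling hash of the bytes themselves, so a boundary found once is found again wherever the same bytes reappear.

```rust
/// Deterministic 256-entry gear table, filled with a splitmix64 generator.
fn gear_table() -> [u64; 256] {
    let mut table = [0u64; 256];
    let mut state: u64 = 0x9E37_79B9_7F4A_7C15;
    for slot in table.iter_mut() {
        state = state.wrapping_add(0x9E37_79B9_7F4A_7C15);
        let mut z = state;
        z = (z ^ (z >> 30)).wrapping_mul(0xBF58_476D_1CE4_E5B9);
        z = (z ^ (z >> 27)).wrapping_mul(0x94D0_49BB_1331_11EB);
        *slot = z ^ (z >> 31);
    }
    table
}

/// Split `data` into content-defined chunks and return their lengths.
/// A cut is made when the rolling hash matches `mask`, but never before
/// `min` bytes and always by `max` bytes.
pub fn chunk_lengths(data: &[u8], min: usize, max: usize, mask: u64) -> Vec<usize> {
    assert!(min >= 1 && max >= min);
    let gear = gear_table();
    let mut lengths = Vec::new();
    let mut start = 0;
    while start < data.len() {
        let end = (start + max).min(data.len());
        let mut hash: u64 = 0;
        let mut cut = end; // fall back to the size limit or end of input
        for i in start..end {
            hash = (hash << 1).wrapping_add(gear[data[i] as usize]);
            if i + 1 - start >= min && (hash & mask) == 0 {
                cut = i + 1;
                break;
            }
        }
        lengths.push(cut - start);
        start = cut;
    }
    lengths
}
```

Each cut decision uses only bytes already seen in the current chunk, so appending data to a file leaves every chunk before the final one unchanged.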

Deduplication Results

Move file A→B: 0 bytes stored (hashes match), 100% saved
Trim video start: ~5% of file stored (only start chunks change), ~95% saved
Append to file: size of the appended data only (existing chunks reused), savings vary
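The three cases above all come from one calculation: only chunks whose hash is not already in the store cost new bytes. The sketch below shows that arithmetic with made-up chunk IDs and sizes; `new_bytes` is a hypothetical helper, and the append case is idealized in the same way as the summary above.

```rust
use std::collections::HashSet;

/// Bytes that must be newly stored for a file described as `(chunk_id, len)`
/// pairs, given the set of chunk IDs already in the store.
pub fn new_bytes(stored: &HashSet<u64>, chunks: &[(u64, usize)]) -> usize {
    let mut seen = HashSet::new();
    chunks
        .iter()
        // Skip chunks the store already has, and count repeats only once.
        .filter(|(id, _)| !stored.contains(id) && seen.insert(*id))
        .map(|(_, len)| *len)
        .sum()
}
```

For example, a file whose chunks are all present costs 0 bytes when moved, while a file with one changed 400-byte chunk at the start costs exactly 400 bytes.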

Design Principles

Content-Addressable Storage

All data in Dits is identified by its cryptographic hash (BLAKE3). This provides:

Automatic deduplication: Identical content is stored once
Data integrity: Corruption is immediately detectable
Immutability: Content cannot be changed without changing its address
Parallel verification: Multiple sources can be verified independently
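These properties can be seen in a small in-memory store keyed by content hash. This is a sketch under a loud assumption: the standard library's `DefaultHasher` stands in for BLAKE3 so the example needs no external crate. It is not cryptographic and must not be used for real content addressing. `ChunkStore` and its methods are hypothetical names.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// STAND-IN address function. Dits uses BLAKE3; DefaultHasher is NOT
/// cryptographic and is used here only to keep the sketch dependency-free.
fn address(data: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    data.hash(&mut hasher);
    hasher.finish()
}

#[derive(Default)]
pub struct ChunkStore {
    pub chunks: HashMap<u64, Vec<u8>>,
}

impl ChunkStore {
    /// Store `data` under its own hash. Returns the address and whether
    /// the content was new (identical content is stored only once).
    pub fn put(&mut self, data: &[u8]) -> (u64, bool) {
        let addr = address(data);
        let is_new = !self.chunks.contains_key(&addr);
        if is_new {
            self.chunks.insert(addr, data.to_vec());
        }
        (addr, is_new)
    }

    /// Read a chunk and re-hash it; corrupted content no longer matches
    /// its address and is reported as missing.
    pub fn get_verified(&self, addr: u64) -> Option<&[u8]> {
        let data = self.chunks.get(&addr)?;
        if address(data) == addr {
            Some(data.as_slice())
        } else {
            None
        }
    }
}
```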

Format-Aware Processing

Unlike generic version control, Dits understands the structure of media files:

Container parsing: MP4/MOV atoms are preserved intact
Keyframe alignment: Chunks align to video I-frames
Metadata extraction: Duration, codec, resolution indexed
Temporal awareness: Changes tracked by timecode, not just bytes
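Container parsing starts with locating the boxes (atoms). In ISOBMFF, each box begins with a 4-byte big-endian size and a 4-byte type; a size of 1 means a 64-bit size follows, and a size of 0 means the box runs to the end of the file. The sketch below walks top-level boxes on that basis. `BoxHeader` and `top_level_boxes` are hypothetical names, and how Dits actually uses these boundaries is not shown in the source.

```rust
#[derive(Debug, PartialEq, Eq)]
pub struct BoxHeader {
    /// Four-character box type, e.g. b"ftyp" or b"mdat".
    pub kind: [u8; 4],
    /// Byte offset where the box starts.
    pub offset: usize,
    /// Total box size in bytes, header included.
    pub size: usize,
}

/// List the top-level boxes in `data`, stopping at the first malformed header.
pub fn top_level_boxes(data: &[u8]) -> Vec<BoxHeader> {
    let mut boxes = Vec::new();
    let mut pos = 0usize;
    while pos + 8 <= data.len() {
        let size32 = u32::from_be_bytes([data[pos], data[pos + 1], data[pos + 2], data[pos + 3]]);
        let kind = [data[pos + 4], data[pos + 5], data[pos + 6], data[pos + 7]];
        let header_len = if size32 == 1 { 16 } else { 8 };
        let size = match size32 {
            0 => data.len() - pos, // box extends to end of file
            1 => {
                if pos + 16 > data.len() {
                    break;
                }
                let mut big = [0u8; 8];
                big.copy_from_slice(&data[pos + 8..pos + 16]);
                u64::from_be_bytes(big) as usize // 64-bit "largesize"
            }
            n => n as usize,
        };
        if size < header_len || size > data.len() - pos {
            break; // malformed or truncated box
        }
        boxes.push(BoxHeader { kind, offset: pos, size });
        pos += size;
    }
    boxes
}
```

A format-aware chunker could use offsets like these to avoid cutting through small metadata boxes, though that policy is an assumption here.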

Efficient Synchronization

The transport layer minimizes data transfer:

Delta sync: Only missing chunks are transferred
Parallel streams: Multiple chunks transfer simultaneously
Resumable: Interrupted transfers continue where they stopped
Bandwidth adaptive: Adjusts to network conditions
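Resumability and parallel streams can both be modeled as bookkeeping over unacknowledged chunks. The sketch below is a hypothetical client-side state machine, not the Dits wire protocol: `Upload`, `next_batch`, and `ack` are invented names, and the number of streams is simply passed in rather than derived from a bandwidth estimate.

```rust
use std::collections::BTreeSet;

/// Hypothetical client-side state for one resumable upload.
pub struct Upload {
    /// Chunk IDs the server has not acknowledged yet.
    pending: BTreeSet<u64>,
}

impl Upload {
    pub fn new(chunk_ids: &[u64]) -> Self {
        Upload { pending: chunk_ids.iter().copied().collect() }
    }

    /// Pick up to `streams` unacknowledged chunks to send in parallel.
    pub fn next_batch(&self, streams: usize) -> Vec<u64> {
        self.pending.iter().copied().take(streams).collect()
    }

    /// Record a server acknowledgement; acknowledged chunks are never re-sent.
    pub fn ack(&mut self, chunk_id: u64) {
        self.pending.remove(&chunk_id);
    }

    pub fn is_done(&self) -> bool {
        self.pending.is_empty()
    }
}
```

If the connection drops after some acknowledgements, the next batch starts from the first chunk that was never acknowledged, so completed work is not repeated.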

Crate Structure

Dits is organized into several Rust crates:

dits-core/: Chunking, hashing, manifests, object model
dits-parsers/: ISOBMFF, NLE project file parsing
dits-storage/: Local and remote storage backends
dits-protocol/: Wire protocol, serialization
dits-client/: CLI implementation
dits-server/: REST API and QUIC server
dits-vfs/: FUSE virtual filesystem
dits-sdk/: Public Rust SDK

Data Flow

Adding a File

Read: Read the file from the filesystem
Parse: Parse the container format (if it is a video file)
Chunk: Split with FastCDC, aligning chunk boundaries to keyframes
Hash: Hash each chunk with BLAKE3
Dedup: Check for chunks that already exist
Compress: Compress new chunks only
Store: Write chunks to .dits/objects/chunks/
Manifest: Create the asset manifest
Stage: Update the staging index
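The hash, dedup, store, manifest, and stage steps can be sketched together as one function. This is a simplified model under stated assumptions: chunks arrive already split, parsing and compression are skipped, `DefaultHasher` is a non-cryptographic stand-in for BLAKE3, and the `Repo` and `AssetManifest` types are hypothetical.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// STAND-IN for BLAKE3; not cryptographic, used only for illustration.
fn chunk_id(data: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    data.hash(&mut hasher);
    hasher.finish()
}

/// Ordered recipe for rebuilding one file from chunks.
pub struct AssetManifest {
    pub size: usize,
    pub chunks: Vec<(u64, usize)>, // (chunk id, length) in file order
}

#[derive(Default)]
pub struct Repo {
    /// Stands in for .dits/objects/chunks/.
    pub objects: HashMap<u64, Vec<u8>>,
    /// Staging index: path -> asset manifest.
    pub index: HashMap<String, AssetManifest>,
}

impl Repo {
    /// Add a file given as pre-split chunks; returns bytes newly written.
    pub fn add(&mut self, path: &str, chunks: &[&[u8]]) -> usize {
        let mut written = 0;
        let mut manifest = AssetManifest { size: 0, chunks: Vec::new() };
        for chunk in chunks {
            let id = chunk_id(chunk);
            // Dedup: only chunks not already present are written.
            if !self.objects.contains_key(&id) {
                self.objects.insert(id, chunk.to_vec());
                written += chunk.len();
            }
            manifest.chunks.push((id, chunk.len()));
            manifest.size += chunk.len();
        }
        self.index.insert(path.to_string(), manifest);
        written
    }
}
```

Adding a second version of a file that shares most chunks with the first writes only the chunks that differ, while its manifest still lists every chunk in order.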

Pushing Changes

Enumerate commits to push

Get tree and asset hashes

Query remote for existing chunks

Upload only missing chunks (delta)

Upload manifests and commits

Update remote references
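The ordering of a push matters: chunks go first, then the metadata that refers to them, and references move last so the remote never points at incomplete data. The sketch below models that as a plan of steps. The `Step` enum and both functions are hypothetical, and chunk IDs are plain integers for brevity.

```rust
use std::collections::HashSet;

#[derive(Debug, PartialEq, Eq)]
pub enum Step {
    UploadChunk(u64),
    /// Manifests and commits, sent after the chunks they reference.
    UploadMetadata,
    UpdateRef(String),
}

/// Chunks the remote lacks, in first-seen order and without duplicates.
pub fn missing_chunks(local: &[u64], remote_has: &HashSet<u64>) -> Vec<u64> {
    let mut seen = HashSet::new();
    local
        .iter()
        .copied()
        .filter(|id| !remote_has.contains(id) && seen.insert(*id))
        .collect()
}

/// Build the push plan: missing chunks, then metadata, then the ref update.
pub fn push_plan(local: &[u64], remote_has: &HashSet<u64>, branch: &str) -> Vec<Step> {
    let mut plan: Vec<Step> = missing_chunks(local, remote_has)
        .into_iter()
        .map(Step::UploadChunk)
        .collect();
    plan.push(Step::UploadMetadata);
    plan.push(Step::UpdateRef(branch.to_string()));
    plan
}
```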

Cloning a Repository

Fetch remote refs (branches, tags)

Fetch commit graph

Fetch tree manifests

Fetch asset manifests

(Sparse) Mark required chunks

Fetch and verify chunks

Reconstruct working directory
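For the sparse step, one plausible reading is that the client selects a subset of paths and fetches only the chunks those assets need. The sketch below assumes selection by path prefix, which the source does not confirm; `required_chunks` is a hypothetical helper and manifests are reduced to lists of integer chunk IDs.

```rust
use std::collections::{BTreeSet, HashMap};

/// Chunk IDs needed to materialize the selected paths.
/// `wanted_prefix = None` means a full clone.
pub fn required_chunks(
    manifests: &HashMap<String, Vec<u64>>,
    wanted_prefix: Option<&str>,
) -> BTreeSet<u64> {
    manifests
        .iter()
        .filter(|(path, _)| wanted_prefix.map_or(true, |prefix| path.starts_with(prefix)))
        .flat_map(|(_, chunks)| chunks.iter().copied())
        .collect() // the set removes chunks shared between assets
}
```

Since manifests are small compared with chunk data, fetching all manifests first and deciding afterwards which chunks to download keeps a sparse clone cheap.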

Security Model

Authentication: JWTs with token refresh, MFA support
Authorization: Repository-level permissions (admin, write, read)
Encryption at rest: Optional AES-256-GCM chunk encryption
Encryption in transit: TLS 1.3 for all connections
Integrity: All content verified by BLAKE3 hash
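The repository-level permissions can be sketched as an ordered role check. This assumes the three roles are hierarchical (admin includes write, write includes read), which the source implies but does not state; the `Action` variants and their required roles are invented for illustration.

```rust
/// Roles from the list above; declaration order gives Read < Write < Admin.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
pub enum Role {
    Read,
    Write,
    Admin,
}

/// Hypothetical repository actions.
#[derive(Debug, Clone, Copy)]
pub enum Action {
    Fetch,
    Push,
    ManageMembers,
}

/// Minimum role needed for each action (assumed mapping).
pub fn required_role(action: Action) -> Role {
    match action {
        Action::Fetch => Role::Read,
        Action::Push => Role::Write,
        Action::ManageMembers => Role::Admin,
    }
}

/// A role may perform an action if it is at least the required role.
pub fn is_allowed(role: Role, action: Action) -> bool {
    role >= required_role(action)
}
```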

Detailed Documentation

Data Structures - Chunks, assets, commits, and manifests
Algorithms - FastCDC, BLAKE3, keyframe alignment
Network Protocol - QUIC transport, delta sync, API
Security - Authentication, encryption, and access control