Core Concepts

Tensor-Aware Chunking

Treating model weights as the float tensors they are, instead of as an opaque blob of bytes.

Warning

This is a roadmap and design topic, not a shipped feature. Today only L1 ships: exact content addressing, FastCDC deduplication, BLAKE3 hashing, and local history. Everything below describes where tensor-aware addressing is headed.

The honest weak spot of byte-level dedup

Content-defined chunking (see Chunking) shines when edits are local: insert a paragraph, append a row, patch a function, and only the chunks around that edit change. Everything else stays byte-identical and dedupes for free.

Model weights break that assumption. Weights are arrays of floats, and a single gradient step nudges almost every parameter by a tiny amount. The change is diffuse, not local: bytes change throughout the file at once. That is precisely the case content-defined chunking handles worst, because almost no chunk stays byte-identical from one checkpoint to the next, so there is little left to dedupe.
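The contrast between a local edit and a diffuse one can be made concrete with a small experiment. The sketch below is illustrative only and rests on stated assumptions: fixed-size chunks stand in for FastCDC, `hashlib.blake2b` stands in for BLAKE3 (neither FastCDC nor BLAKE3 is in the Python standard library), and the "gradient step" is simulated random noise. The helper names are hypothetical.

```python
import hashlib
import random
import struct

CHUNK_SIZE = 4096  # assumption: fixed-size chunks stand in for FastCDC


def chunk_hashes(blob: bytes, size: int = CHUNK_SIZE) -> list[str]:
    """Hash each chunk; blake2b is a stdlib stand-in for BLAKE3."""
    return [
        hashlib.blake2b(blob[i:i + size], digest_size=16).hexdigest()
        for i in range(0, len(blob), size)
    ]


def pack_f32(values: list[float]) -> bytes:
    """Serialize values as little-endian float32, like a raw weight buffer."""
    return struct.pack(f"<{len(values)}f", *values)


def shared_fraction(old: bytes, new: bytes) -> float:
    """Fraction of chunks in `new` whose hash already exists in `old`."""
    old_set = set(chunk_hashes(old))
    new_hashes = chunk_hashes(new)
    return sum(h in old_set for h in new_hashes) / len(new_hashes)


rng = random.Random(0)
weights = [rng.uniform(-1.0, 1.0) for _ in range(64 * 1024)]

# Local edit: overwrite 16 adjacent values, like patching one region of a file.
local = list(weights)
local[2000:2016] = [0.0] * 16

# Diffuse edit: nudge every value slightly, as a gradient step would.
diffuse = [w - 1e-3 * rng.uniform(-1.0, 1.0) for w in weights]

base = pack_f32(weights)
local_shared = shared_fraction(base, pack_f32(local))
diffuse_shared = shared_fraction(base, pack_f32(diffuse))
print(f"local edit:   {local_shared:.1%} of chunks reused")
print(f"diffuse edit: {diffuse_shared:.1%} of chunks reused")
```

Under these assumptions the local edit leaves nearly every chunk reusable, while the diffuse edit leaves essentially none, which is the behavior described above.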

Note

Byte-CDC still wins big across related models — a fine-tune versus its base, or sibling variants that share frozen layers — because large contiguous regions stay identical. The weak case is specifically consecutive training checkpoints of the same run.

What tensor-aware chunking would do

Instead of feeding raw bytes to FastCDC, a tensor-aware path would parse the tensor container — a safetensors file, for example — and use its dtype and shape metadata to make smarter decisions:

Per-tensor chunking. Split on tensor boundaries so each weight matrix is addressed independently; an unchanged embedding table or frozen layer dedupes whole, regardless of churn elsewhere.
Dtype/shape-aware boundaries. Align chunk cuts to element and row strides rather than arbitrary byte offsets, so a re-quantization or a layout change does not desynchronize every downstream chunk.
Tensor-domain diffs. Compute deltas between checkpoints in the numeric domain — structured or low-rank residuals, or quantized deltas — and store only the compact difference, not a fresh copy of bytes that all moved a little.
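As a rough illustration of the per-tensor idea, the sketch below reads the safetensors layout (an 8-byte little-endian header length, a JSON header with `dtype`, `shape`, and `data_offsets` per tensor, then the raw data buffer) and hashes each tensor's bytes on its own. It is not a shipped API: `build_safetensors` and `tensor_addresses` are hypothetical helpers, the blobs are tiny in-memory examples, and `hashlib.blake2b` is a stand-in for BLAKE3.

```python
import hashlib
import json
import struct


def build_safetensors(tensors: dict[str, tuple[str, list[int], bytes]]) -> bytes:
    """Assemble a minimal safetensors-style blob for the demo."""
    header, data, offset = {}, b"", 0
    for name, (dtype, shape, raw) in tensors.items():
        header[name] = {
            "dtype": dtype,
            "shape": shape,
            "data_offsets": [offset, offset + len(raw)],
        }
        data += raw
        offset += len(raw)
    header_bytes = json.dumps(header).encode("utf-8")
    return struct.pack("<Q", len(header_bytes)) + header_bytes + data


def tensor_addresses(blob: bytes) -> dict[str, str]:
    """Map each tensor name to a hash of that tensor's bytes only."""
    (header_len,) = struct.unpack_from("<Q", blob, 0)
    header = json.loads(blob[8:8 + header_len])
    data = memoryview(blob)[8 + header_len:]
    addresses = {}
    for name, info in header.items():
        if name == "__metadata__":  # optional metadata entry, not a tensor
            continue
        start, end = info["data_offsets"]  # relative to the data buffer
        # Assumption: blake2b stands in for BLAKE3.
        addresses[name] = hashlib.blake2b(data[start:end], digest_size=16).hexdigest()
    return addresses


embed = struct.pack("<8f", *[0.5] * 8)
mlp_old = struct.pack("<8f", *[0.25] * 8)
mlp_new = struct.pack("<8f", *[0.2501] * 8)

ckpt_1000 = build_safetensors({
    "model.embed.weight": ("F32", [2, 4], embed),
    "model.layers.0.mlp": ("F32", [2, 4], mlp_old),
})
ckpt_1001 = build_safetensors({
    "model.embed.weight": ("F32", [2, 4], embed),
    "model.layers.0.mlp": ("F32", [2, 4], mlp_new),
})

a, b = tensor_addresses(ckpt_1000), tensor_addresses(ckpt_1001)
unchanged = [name for name in a if a[name] == b.get(name)]
print("tensors that dedupe whole:", unchanged)
```

Because each tensor is addressed independently, the unchanged embedding keeps the same address in both checkpoints even though the other tensor changed.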

checkpoint_1000.safetensors
  ├─ model.embed.weight   → blake3:ab12…  (unchanged → dedupes)
  ├─ model.layers.0.mlp    → blake3:cd34…  (diffuse change)
  └─ …
checkpoint_1001.safetensors
  ├─ model.embed.weight   → blake3:ab12…  (same address, 0 bytes stored)
  └─ model.layers.0.mlp    → delta(cd34… → ef56…)  (low-rank residual only)
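A tensor-domain delta could take many forms. The sketch below shows one of the simplest, a quantized delta, rather than the low-rank residual named in the diagram. It is an assumption-laden illustration, not the planned design: `quantize_delta` and `apply_delta` are hypothetical names, the delta is stored as one int8 code per element plus a single scale, and the checkpoint data is simulated.

```python
import random
import struct


def quantize_delta(old: list[float], new: list[float]) -> tuple[float, bytes]:
    """Encode new - old as int8 codes plus one float scale."""
    delta = [n - o for o, n in zip(old, new)]
    max_abs = max((abs(d) for d in delta), default=0.0)
    scale = max_abs / 127.0 if max_abs > 0.0 else 1.0
    codes = [round(d / scale) for d in delta]  # each code lies in [-127, 127]
    return scale, struct.pack(f"<{len(codes)}b", *codes)


def apply_delta(old: list[float], scale: float, packed: bytes) -> list[float]:
    """Reconstruct an approximation of the new tensor from old + delta."""
    codes = struct.unpack(f"<{len(packed)}b", packed)
    return [o + c * scale for o, c in zip(old, codes)]


rng = random.Random(1)
old = [rng.uniform(-1.0, 1.0) for _ in range(1024)]
# Simulated training step: every value moves by a small amount.
new = [o + 1e-3 * rng.uniform(-1.0, 1.0) for o in old]

scale, packed = quantize_delta(old, new)
restored = apply_delta(old, scale, packed)
max_err = max(abs(r - n) for r, n in zip(restored, new))

print(f"delta: {len(packed)} bytes, fresh float32 copy: {4 * len(new)} bytes")
print(f"max reconstruction error: {max_err:.2e}")
```

This encoding is lossy: the reconstruction error is bounded by half the quantization step. A scheme that has to reproduce the new checkpoint exactly would also need to store whatever residual remains after quantization.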

Prior art

This is an area of active investment across the ecosystem. Hugging Face's Xet storage layer uses chunk-level dedup for model and dataset repos, and tensor-aware chunking follows the broader push toward storage that understands tensor structure.

Tip

For the bigger picture of how addressing layers stack, see Addressing and the How it works overview. Sequencing lives on the Roadmap.