Research, not a shipped AI product

These pages explore possible model and dataset applications of the generic Dits engine. AI-specific formats, workflows, remote sync, similarity layers, and recompute orchestration are not implemented.

Dits for AI research

Research notes

A design space for models, datasets, checkpoints, and scientific artifacts. It is not a second Dits product that is available today.

Shared engine

The only reusable implementation today is the generic local Dits engine documented in the core product docs. It stores arbitrary bytes exactly as given but attaches no AI-specific semantics to them.

Questions worth investigating

Which real model and dataset histories produce useful exact chunk reuse?
How should tensor layout and metadata be represented without losing source fidelity?
What inputs and environment make a derived artifact genuinely reproducible?
How can similarity aid search while remaining separate from exact identity?
Where should Dits interoperate with Xet, DVC, registries, and experiment trackers?
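
As one way to make the first question concrete, the sketch below measures exact chunk reuse between two versions of a byte string. It is a minimal illustration, not the Dits engine: this page does not specify how Dits chunks data, so the fixed-size chunking, the SHA-256 digest, and the names `chunk_digests` and `exact_reuse_ratio` are all assumptions made for the example.

```python
import hashlib


def chunk_digests(data: bytes, chunk_size: int = 4) -> list[str]:
    """Split data into fixed-size chunks and return each chunk's SHA-256 hex digest.

    Fixed-size chunking is an assumption for this sketch, not the Dits format.
    """
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [
        hashlib.sha256(data[i:i + chunk_size]).hexdigest()
        for i in range(0, len(data), chunk_size)
    ]


def exact_reuse_ratio(old: bytes, new: bytes, chunk_size: int = 4) -> float:
    """Return the fraction of chunks in `new` whose exact bytes also occur in `old`."""
    known = set(chunk_digests(old, chunk_size))
    new_chunks = chunk_digests(new, chunk_size)
    if not new_chunks:
        return 0.0
    reused = sum(1 for digest in new_chunks if digest in known)
    return reused / len(new_chunks)


old_version = b"AAAABBBBCCCC"
in_place_edit = b"AAAAXXXXCCCC"    # one chunk overwritten, boundaries unchanged
shifted_edit = b"ZAAAABBBBCCCC"    # one byte inserted at the front

print(exact_reuse_ratio(old_version, in_place_edit))
print(exact_reuse_ratio(old_version, shifted_edit))
```

In this toy input the in-place edit reuses two of three chunks, while the one-byte insertion shifts every fixed-size boundary and reuses none. That sensitivity is why the question has to be answered against real model and dataset histories rather than assumed.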

Required evidence

A useful proposal needs a redistributable corpus, an exact workload generator, declared fidelity criteria, storage and decode measurements, recovery tests, and equivalent baselines. Modeled savings are not benchmark results.
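
The requirements above could be tracked as a simple checklist. The sketch below is hypothetical and not part of Dits: the `ProposalEvidence` class, its field names, and the `missing_evidence` helper are invented here to mirror the list in the preceding paragraph.

```python
from dataclasses import dataclass, fields


@dataclass
class ProposalEvidence:
    """Hypothetical checklist; each field mirrors one requirement named above."""
    redistributable_corpus: bool = False
    exact_workload_generator: bool = False
    declared_fidelity_criteria: bool = False
    storage_and_decode_measurements: bool = False
    recovery_tests: bool = False
    equivalent_baselines: bool = False


def missing_evidence(evidence: ProposalEvidence) -> list[str]:
    """Return the names of the requirements the proposal has not yet satisfied."""
    return [f.name for f in fields(evidence) if not getattr(evidence, f.name)]


draft = ProposalEvidence(redistributable_corpus=True, recovery_tests=True)
print(missing_evidence(draft))
```

A proposal would count as ready for review only when `missing_evidence` returns an empty list. There is deliberately no field for modeled savings, since those are not benchmark results.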

Current boundaries

No tensor-aware chunk format or supported AI schema.
No model registry, experiment tracker, or pipeline orchestrator.
No similarity-addressed object identity.
No recompute service or reproducibility guarantee.
No network artifact transfer or hosted service.

Start with the research model, the benchmark gaps, and the core roadmap gates.