Tool for ensuring integrity and error correction

NicolasHorde 976032187a doc: -auto option		2025-12-20 19:49:19 +01:00
.gitignore	fix: correct auto command	2025-12-20 14:29:48 +01:00
go.mod	first commit	2025-12-20 10:18:15 +01:00
main.go	feat: auto choose params to save space	2025-12-20 19:46:45 +01:00
make.sh	feat: auto and config functions	2025-12-20 13:11:56 +01:00
README.md	doc: -auto option	2025-12-20 19:49:19 +01:00

README.md

BLIND

BLIND — Bit-level Long-term Integrity with Non-destructive Detection

BLIND is a file-level integrity and resilience tool designed to detect and repair silent data corruption over the long term, without relying on filesystem-level features or kernel modules.

It operates above the filesystem, using cryptographic hashes and parity data to ensure that what you read tomorrow is exactly what you wrote years ago.

Why BLIND exists

Modern filesystems (including APFS) and RAID configurations protect against disk failures, but most of them do not protect against:

silent bit rot
latent sector corruption
controller or firmware errors
memory or DMA faults
long-term degradation on external or cold storage

In many cases, corrupted data is returned without any error.

BLIND addresses this blind spot by providing explicit, verifiable integrity checks at the file level.

Core principles

Bit-level integrity
Corruption is detected at the bit level using cryptographic hashes.
Non-destructive detection
BLIND never modifies data during verification. Detection is always safe.
Filesystem-independent
Works on APFS, ext4, NTFS, FAT, exFAT, network mounts, USB disks, NAS, backups, archives.
Portable by design
Integrity metadata travels with the data. Copy the folder, keep the protection.
Explicit repair
Repair is deliberate and offline, never automatic or hidden.

How it works

For each directory, BLIND generates (unless stated otherwise):

.BLAKE3SUMS
Hashes for files ≥ 4 KiB (fast, strong, scalable)
.SHA256SUMS
Hashes for files < 4 KiB
.MANIFEST.json
Snapshot of file paths, sizes and modification times
.<directory>.par2 and .<directory>.vol*.par2
PAR2 parity files for error correction and recovery

All generated files are excluded from hashing and parity generation.
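
The split between the two checksum files and the exclusion rule can be sketched in Go as follows. This is a minimal illustration, not BLIND's actual code: the function names `sumsFileFor`, `isGenerated` and `sha256Line` are hypothetical, and the `<hex>  <name>` line layout is an assumption borrowed from the usual `sha256sum` convention. BLAKE3 is not in the Go standard library, so only the SHA-256 side is computed here.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// smallThreshold mirrors the documented default of the -small option (4096 bytes).
const smallThreshold = 4096

// sumsFileFor returns the checksum file that lists a file of the given size:
// files below the threshold go to .SHA256SUMS, all others to .BLAKE3SUMS.
func sumsFileFor(size, threshold int64) string {
	if size < threshold {
		return ".SHA256SUMS"
	}
	return ".BLAKE3SUMS"
}

// isGenerated reports whether name looks like one of the generated files
// listed above, which must be skipped when hashing and building parity.
func isGenerated(name string) bool {
	switch name {
	case ".BLAKE3SUMS", ".SHA256SUMS", ".MANIFEST.json":
		return true
	}
	return strings.HasPrefix(name, ".") && strings.HasSuffix(name, ".par2")
}

// sha256Line formats one checksum entry as "<hex>  <name>" (assumed layout).
func sha256Line(name string, data []byte) string {
	sum := sha256.Sum256(data)
	return hex.EncodeToString(sum[:]) + "  " + name
}

func main() {
	fmt.Println(sumsFileFor(100, smallThreshold))
	fmt.Println(sumsFileFor(4096, smallThreshold))
	fmt.Println(isGenerated(".photos.vol000+10.par2"))
	fmt.Println(sha256Line("empty.txt", nil))
}
```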

Commands

Encode (create or update integrity data)

blind encode [options] <folder>

Generates hashes, manifest and PAR2 (unless -hash-only)
Skips unchanged folders unless -f is used
Cleans old PAR2 files before recreating them

Scan (update only modified folders)

blind scan [options] <folder>

Scans folders
Re-encodes only those that changed
Ideal for periodic maintenance
Supports -hash-only

Verify (detect corruption)

blind verify [options] <folder>

Verifies hashes (BLAKE3 / SHA256)
Verifies PAR2 if present
Never stops on first error
Collects all integrity issues
With -report, always prints a verification report
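
The "never stop on first error" behavior can be illustrated with a small Go sketch. It is an assumption about how such a loop might look, not BLIND's implementation: the `Issue` type, `verifyAll` and the `read` callback are hypothetical, and only SHA-256 is checked so the example stays within the standard library.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
	"sort"
)

// errMissing is a stand-in error for a file that cannot be read.
var errMissing = errors.New("file not found")

// Issue describes one integrity problem found during verification.
type Issue struct {
	Path   string
	Reason string
}

// verifyAll checks every expected hash and keeps going after a failure,
// so the caller receives the complete list of issues in one pass.
func verifyAll(expected map[string]string, read func(string) ([]byte, error)) []Issue {
	paths := make([]string, 0, len(expected))
	for p := range expected {
		paths = append(paths, p)
	}
	sort.Strings(paths) // deterministic report order

	var issues []Issue
	for _, p := range paths {
		data, err := read(p)
		if err != nil {
			issues = append(issues, Issue{Path: p, Reason: "unreadable: " + err.Error()})
			continue
		}
		sum := sha256.Sum256(data)
		if hex.EncodeToString(sum[:]) != expected[p] {
			issues = append(issues, Issue{Path: p, Reason: "hash mismatch"})
		}
	}
	return issues
}

func main() {
	// In-memory files stand in for the directory being verified.
	files := map[string][]byte{"a.txt": []byte("hello"), "b.txt": []byte("corrupted")}
	read := func(p string) ([]byte, error) {
		data, ok := files[p]
		if !ok {
			return nil, errMissing
		}
		return data, nil
	}
	good := sha256.Sum256([]byte("hello"))
	want := hex.EncodeToString(good[:])
	expected := map[string]string{"a.txt": want, "b.txt": want, "c.txt": want}

	for _, is := range verifyAll(expected, read) {
		fmt.Printf("%s: %s\n", is.Path, is.Reason)
	}
}
```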

Repair (recover corrupted files)

blind repair [options] <folder>

Runs verification first
Uses PAR2 to repair corrupted or missing data
Verifies again after repair
Reports remaining issues if any
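
For readers unfamiliar with PAR2, the sketch below shows how a Go program might build a par2 command line for a directory. The `create`, `verify` and `repair` subcommands and the `-r<percent>` redundancy switch belong to par2cmdline; whether BLIND passes exactly these arguments is an assumption, and `par2Index` and `par2Args` are hypothetical helpers. The command is only constructed here, not executed.

```go
package main

import (
	"fmt"
	"os/exec"
	"path/filepath"
)

// par2Index returns the index file for dir, following the documented
// ".<directory>.par2" naming scheme.
func par2Index(dir string) string {
	return filepath.Join(dir, "."+filepath.Base(dir)+".par2")
}

// par2Args builds the argument list for a par2 action ("create", "verify"
// or "repair"). Redundancy and the file list are only used by "create".
func par2Args(action, dir string, redundancy int, files []string) []string {
	args := []string{action}
	if action == "create" {
		args = append(args, fmt.Sprintf("-r%d", redundancy))
	}
	args = append(args, par2Index(dir))
	if action == "create" {
		args = append(args, files...)
	}
	return args
}

func main() {
	// Build (but do not run) a repair command for one directory.
	cmd := exec.Command("par2", par2Args("repair", "/data/archive/photos", 0, nil)...)
	fmt.Println(cmd.Args)
}
```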

Clean (remove all generated files)

blind clean [options] <folder>

Removes all BLIND-generated metadata and PAR2 files
Leaves original data untouched

Stat (storage overhead analysis)

blind stat [options] <folder>

Displays payload size
Displays hash overhead
Displays PAR2 overhead
Uses human-readable units
Shows total overhead ratio
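
The figures reported by stat reduce to simple arithmetic. The Go sketch below assumes the overhead ratio is (hashes + PAR2) divided by payload, and that sizes are printed in binary units; both are assumptions, and `humanSize` and `overheadRatio` are hypothetical names.

```go
package main

import "fmt"

// humanSize formats a byte count using binary units (KiB, MiB, ...).
func humanSize(n int64) string {
	const unit = 1024
	if n < unit {
		return fmt.Sprintf("%d B", n)
	}
	div, exp := int64(unit), 0
	for v := n / unit; v >= unit; v /= unit {
		div *= unit
		exp++
	}
	return fmt.Sprintf("%.1f %ciB", float64(n)/float64(div), "KMGTPE"[exp])
}

// overheadRatio returns the metadata size relative to the payload size.
// An empty payload yields 0 to avoid dividing by zero.
func overheadRatio(payload, hashes, parity int64) float64 {
	if payload == 0 {
		return 0
	}
	return float64(hashes+parity) / float64(payload)
}

func main() {
	payload, hashes, parity := int64(10<<30), int64(2<<20), int64(2<<30)
	fmt.Println("payload:", humanSize(payload))
	fmt.Println("hashes: ", humanSize(hashes))
	fmt.Println("PAR2:   ", humanSize(parity))
	fmt.Printf("overhead: %.1f%%\n", 100*overheadRatio(payload, hashes, parity))
}
```

With the default 20% redundancy, the PAR2 term dominates: a 10 GiB payload carries roughly 2 GiB of parity, while hash files stay negligible.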

Important options

General

-r
Recurse into subdirectories (each directory is handled independently)
-exclude a,b,c
Exclude directories by basename when using -r
Example: -exclude node_modules,.git,dist
-v
Verbose output
-q
Quiet mode

Performance

-j N
Number of parallel workers for file hashing
(recommended: 1–2 for HDD/USB, 4–8 for SSD/NVMe)
-jd M
Number of directories processed in parallel when using -r
(recommended: 1–2)
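
The effect of -j can be pictured as a fixed-size worker pool. The Go sketch below is an assumption about the general technique, not BLIND's implementation: `hashAll` is a hypothetical function, in-memory byte slices stand in for files, and SHA-256 is used throughout to stay within the standard library.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sync"
)

// hashAll hashes every input using a pool of `workers` goroutines,
// which plays the role of the -j option.
func hashAll(inputs map[string][]byte, workers int) map[string]string {
	if workers < 1 {
		workers = 1
	}
	jobs := make(chan string)
	results := make(map[string]string, len(inputs))
	var mu sync.Mutex // guards results
	var wg sync.WaitGroup

	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for name := range jobs {
				sum := sha256.Sum256(inputs[name]) // inputs is only read, never written
				mu.Lock()
				results[name] = hex.EncodeToString(sum[:])
				mu.Unlock()
			}
		}()
	}
	for name := range inputs {
		jobs <- name
	}
	close(jobs)
	wg.Wait()
	return results
}

func main() {
	inputs := map[string][]byte{
		"a.txt": []byte("alpha"),
		"b.txt": []byte("beta"),
		"c.txt": []byte("gamma"),
	}
	for name, sum := range hashAll(inputs, 2) {
		fmt.Println(sum, name)
	}
}
```

On spinning disks, more workers mostly add seek contention, which is consistent with the low values recommended above for HDD and USB storage.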

Integrity & resilience

-small N
Threshold in bytes between SHA256 and BLAKE3 (default: 4096)
-parr N
PAR2 redundancy percentage (default: 20)
-par2 <path>
Path to the par2 executable
-hash-only
Generate hashes + manifest only, no PAR2
Old PAR2 files are removed to avoid stale protection
-auto
Enable smart automatic mode

In this mode, BLIND automatically adjusts its behavior based on the detected payload size of each directory.

The following parameters are computed dynamically:

hash-only mode
PAR2 redundancy ratio (-parr)
manifest creation

This option overrides:

-hash-only
-parr

Verification

-report
Always display the verification report, even when no issues are found

Automation

Config (write `.blind.yaml`)

blind config [options] <folder>

Creates a .blind.yaml configuration file in the current directory.

The configuration stores:

absolute target folder
recursion mode
hash-only mode
size threshold (small)
PAR2 settings (parr, par2)
concurrency settings (j, jd)
excluded directories

This allows running blind auto later without specifying the folder again.

Example:

blind config -r -exclude .git,node_modules -parr 20 /data/archive

Auto (verify → detect changes → act)

blind auto

Runs an automated integrity workflow based on .blind.yaml.

Workflow:

Verify

Verifies hashes (BLAKE3 / SHA256)
Verifies PAR2 if present
Counts scanned files
Never stops on first error

Detect changes

Compares current files with .MANIFEST.json
Detects added, deleted or modified files
Tracks affected directories
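
Change detection against a manifest can be sketched in Go as a set comparison. The actual schema of .MANIFEST.json is not described here, so the `Entry` fields and JSON keys below are assumptions based on the stated contents (paths, sizes, modification times), and `diff` and `Changes` are hypothetical names.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"time"
)

// Entry is one manifest record (assumed layout).
type Entry struct {
	Path    string    `json:"path"`
	Size    int64     `json:"size"`
	ModTime time.Time `json:"mtime"`
}

// Changes lists the paths that differ between two snapshots.
type Changes struct {
	Added, Deleted, Modified []string
}

// diff compares the recorded manifest with the current directory state.
// A file counts as modified when its size or modification time differs.
func diff(old, cur []Entry) Changes {
	prev := make(map[string]Entry, len(old))
	for _, e := range old {
		prev[e.Path] = e
	}
	var c Changes
	for _, e := range cur {
		p, ok := prev[e.Path]
		switch {
		case !ok:
			c.Added = append(c.Added, e.Path)
		case p.Size != e.Size || !p.ModTime.Equal(e.ModTime):
			c.Modified = append(c.Modified, e.Path)
		}
		delete(prev, e.Path)
	}
	for path := range prev { // whatever remains no longer exists on disk
		c.Deleted = append(c.Deleted, path)
	}
	sort.Strings(c.Added)
	sort.Strings(c.Deleted)
	sort.Strings(c.Modified)
	return c
}

func main() {
	raw := `[{"path":"a.txt","size":10,"mtime":"2025-01-01T00:00:00Z"},
	         {"path":"b.txt","size":20,"mtime":"2025-01-01T00:00:00Z"}]`
	var old []Entry
	if err := json.Unmarshal([]byte(raw), &old); err != nil {
		panic(err)
	}
	cur := []Entry{
		{Path: "a.txt", Size: 11, ModTime: old[0].ModTime},
		{Path: "c.txt", Size: 5, ModTime: old[0].ModTime},
	}
	fmt.Printf("%+v\n", diff(old, cur))
}
```

Comparing size and modification time is cheap but cannot see corruption that leaves both unchanged, which is why the workflow runs hash verification as a separate step.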

Decision (interactive)

Actions:
  [Enter] : do nothing (default)
  s       : scan (add/update after changes)
  r       : repair (try to repair verify failures)
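
The prompt above maps one line of input to one of three actions. A minimal Go sketch of that mapping follows; `Action` and `parseAction` are hypothetical names, and treating unknown input as "do nothing" and accepting uppercase letters are assumptions, not documented behavior.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
)

// Action is the choice made at the interactive prompt.
type Action int

const (
	ActionNone Action = iota // [Enter]: do nothing (default)
	ActionScan               // s: re-encode modified directories
	ActionRepair             // r: attempt a PAR2 repair
)

// parseAction maps one line of user input to an Action.
// Anything other than "s" or "r" falls back to the safe default.
func parseAction(line string) Action {
	switch strings.ToLower(strings.TrimSpace(line)) {
	case "s":
		return ActionScan
	case "r":
		return ActionRepair
	default:
		return ActionNone
	}
}

func main() {
	fmt.Print("Action [Enter/s/r]: ")
	// A read error (for example EOF) leaves line empty, which means "do nothing".
	line, _ := bufio.NewReader(os.Stdin).ReadString('\n')
	switch parseAction(line) {
	case ActionScan:
		fmt.Println("scan selected")
	case ActionRepair:
		fmt.Println("repair selected")
	default:
		fmt.Println("nothing to do")
	}
}
```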

Actions

Scan (s)
Re-encodes only modified directories
Repair (r)
Attempts repair using PAR2
Re-verifies after repair

Final status

Exits cleanly if integrity is restored
Returns non-zero status if verification still fails

What BLIND is (and is not)

BLIND is

an integrity and resilience layer
suitable for archives, backups, external drives
safe on macOS (no kernel extensions)
deterministic and auditable

BLIND is not

a filesystem
a real-time protection mechanism
a replacement for backups
a RAID system

Comparison with other approaches

Feature	BLIND	APFS RAID 1	ZFS
Bit-level checksums	Yes	No	Yes
Silent corruption detection	Yes	No	Yes
Error correction	Yes (PAR2)	No	Yes (with redundancy)
Real-time protection	No	Yes	Yes
macOS native	Yes	Yes	No
Kernel dependencies	No	No	Yes
Portability	Excellent	Poor	Poor

BLIND complements filesystems and RAID — it does not replace them.

Typical use cases

Long-term archives
Photo and video collections
Research data
Source code archives
External and removable storage
Cold backups
Data that must remain correct years later

Philosophy

Storage systems are fast and reliable — until they are silently wrong.
BLIND exists to make corruption visible, verifiable, and recoverable.

License

TBD

Author

Nicolas Hordé: nicolas.horde@linux.com

BLIND was designed for correctness, portability, and long-term trust in data.
