Tool for ensuring integrity and error correction

BLIND

BLIND — Bit-level Long-term Integrity with Non-destructive Detection

BLIND is a file-level integrity and resilience tool designed to detect and repair silent data corruption over the long term, without relying on filesystem-level features or kernel modules.

It operates above the filesystem, using cryptographic hashes and parity data to ensure that what you read tomorrow is exactly what you wrote years ago.


Why BLIND exists

Modern filesystems (including APFS) and RAID configurations protect against disk failures, but most of them do not protect against:

  • silent bit rot
  • latent sector corruption
  • controller or firmware errors
  • memory or DMA faults
  • long-term degradation on external or cold storage

In many cases, corrupted data is returned without any error.

BLIND addresses this blind spot by providing explicit, verifiable integrity checks at the file level.


Core principles

  • Bit-level integrity
    Corruption is detected at the bit level using cryptographic hashes.

  • Non-destructive detection
    BLIND never modifies data during verification. Detection is always safe.

  • Filesystem-independent
    Works on APFS, ext4, NTFS, FAT, exFAT, network mounts, USB disks, NAS, backups, archives.

  • Portable by design
    Integrity metadata travels with the data. Copy the folder, keep the protection.

  • Explicit repair
    Repair is deliberate and offline, never automatic or hidden.


How it works

For each directory, BLIND generates (unless stated otherwise):

  • .BLAKE3SUMS
    Hashes for files ≥ 4 KiB (fast, strong, scalable)

  • .SHA256SUMS
    Hashes for files < 4 KiB

  • .MANIFEST.json
    Snapshot of file paths, sizes and modification times

  • .<directory>.par2 and .<directory>.vol*.par2
    PAR2 parity files for error correction and recovery

All generated files are excluded from hashing and parity generation.
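
The selection rule above can be sketched in Go as follows. This is a minimal illustration under stated assumptions, not BLIND's actual code: the function names `sumsFileFor` and `isGenerated` are hypothetical, and the PAR2 name check is only inferred from the file names listed above.

```go
package main

import (
	"fmt"
	"strings"
)

// smallThreshold mirrors the documented default of the -small option (4096 bytes).
const smallThreshold = 4096

// sumsFileFor returns the checksum file that a file of the given size
// would be recorded in: SHA-256 below the threshold, BLAKE3 otherwise.
// (Hypothetical helper, named for this example only.)
func sumsFileFor(size, threshold int64) string {
	if size < threshold {
		return ".SHA256SUMS"
	}
	return ".BLAKE3SUMS"
}

// isGenerated reports whether name looks like one of the files generated
// for the directory dirName, so that it can be skipped during hashing.
// (Assumption: the name patterns are taken from the list above.)
func isGenerated(name, dirName string) bool {
	switch name {
	case ".BLAKE3SUMS", ".SHA256SUMS", ".MANIFEST.json":
		return true
	}
	// Covers .<directory>.par2 and .<directory>.vol*.par2.
	return strings.HasPrefix(name, "."+dirName+".") && strings.HasSuffix(name, ".par2")
}

func main() {
	fmt.Println(sumsFileFor(100, smallThreshold))  // .SHA256SUMS
	fmt.Println(sumsFileFor(4096, smallThreshold)) // .BLAKE3SUMS
	fmt.Println(isGenerated(".photos.vol000+10.par2", "photos")) // true
	fmt.Println(isGenerated("holiday.jpg", "photos"))            // false
}
```

A file of exactly 4096 bytes falls on the BLAKE3 side, matching the "≥ 4 KiB" wording above.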


Commands

Encode (create or update integrity data)

blind encode [options] <folder>
  • Generates hashes, manifest and PAR2 (unless -hash-only)
  • Skips unchanged folders unless -f is used
  • Cleans old PAR2 files before recreating them

Scan (update only modified folders)

blind scan [options] <folder>
  • Scans folders
  • Re-encodes only those that changed
  • Ideal for periodic maintenance
  • Supports -hash-only

Verify (detect corruption)

blind verify [options] <folder>
  • Verifies hashes (BLAKE3 / SHA256)
  • Verifies PAR2 if present
  • Never stops on first error
  • Collects all integrity issues
  • With -report, always prints a verification report
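
The "never stops on first error" behaviour can be illustrated with a short Go sketch. It is an assumption-laden example rather than BLIND's implementation: `verifyAll` and `hashHex` are hypothetical names, the data lives in memory instead of on disk, and only SHA-256 is used so that the example needs nothing beyond the standard library.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
	"sort"
)

// hashHex returns the hex-encoded SHA-256 digest of data.
func hashHex(data []byte) string {
	sum := sha256.Sum256(data)
	return hex.EncodeToString(sum[:])
}

// verifyAll checks every expected entry and keeps going after a failure.
// All problems are collected and returned as one joined error (nil if none).
func verifyAll(expected map[string]string, contents map[string][]byte) error {
	// Sort names so the report order is deterministic.
	names := make([]string, 0, len(expected))
	for name := range expected {
		names = append(names, name)
	}
	sort.Strings(names)

	var problems []error
	for _, name := range names {
		data, ok := contents[name]
		switch {
		case !ok:
			problems = append(problems, fmt.Errorf("%s: missing", name))
		case hashHex(data) != expected[name]:
			problems = append(problems, fmt.Errorf("%s: hash mismatch", name))
		}
	}
	return errors.Join(problems...)
}

func main() {
	expected := map[string]string{
		"a.txt": hashHex([]byte("alpha")),
		"b.txt": hashHex([]byte("beta")),
		"c.txt": hashHex([]byte("gamma")),
	}
	contents := map[string][]byte{
		"a.txt": []byte("alpha"),
		"b.txt": []byte("bet4"), // simulated corruption
		// c.txt is deliberately missing
	}
	if err := verifyAll(expected, contents); err != nil {
		fmt.Println(err) // one line per collected problem
	}
}
```

Collecting errors instead of returning early means a single run reports every damaged or missing file, which is what makes a full verification report possible.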

Repair (recover corrupted files)

blind repair [options] <folder>
  • Runs verification first
  • Uses PAR2 to repair corrupted or missing data
  • Verifies again after repair
  • Reports remaining issues if any

Clean (remove all generated files)

blind clean [options] <folder>
  • Removes all BLIND-generated metadata and PAR2 files
  • Leaves original data untouched

Stat (storage overhead analysis)

blind stat [options] <folder>
  • Displays payload size
  • Displays hash overhead
  • Displays PAR2 overhead
  • Uses human-readable units
  • Shows total overhead ratio
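
As a rough illustration of the figures listed above, the following Go sketch formats sizes in binary units and computes an overhead ratio. The helper names `humanSize` and `overheadRatio` are hypothetical, and defining the ratio as (hashes + PAR2) / payload is an assumption; the actual output format of `blind stat` may differ.

```go
package main

import "fmt"

// humanSize formats a byte count using binary units (KiB, MiB, ...).
func humanSize(n int64) string {
	const unit = 1024
	if n < unit {
		return fmt.Sprintf("%d B", n)
	}
	div, exp := int64(unit), 0
	for v := n / unit; v >= unit; v /= unit {
		div *= unit
		exp++
	}
	return fmt.Sprintf("%.1f %ciB", float64(n)/float64(div), "KMGTPE"[exp])
}

// overheadRatio returns (hashes + parity) / payload, or 0 for an empty payload.
// (Assumed definition, for illustration only.)
func overheadRatio(payload, hashes, parity int64) float64 {
	if payload == 0 {
		return 0
	}
	return float64(hashes+parity) / float64(payload)
}

func main() {
	payload, hashes, parity := int64(1<<30), int64(1<<20), int64(200<<20)
	fmt.Println("payload:", humanSize(payload))
	fmt.Println("hashes: ", humanSize(hashes))
	fmt.Println("par2:   ", humanSize(parity))
	fmt.Printf("overhead: %.1f%%\n", 100*overheadRatio(payload, hashes, parity))
}
```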

Important options

General

  • -r
    Recurse into subdirectories (each directory is handled independently)

  • -exclude a,b,c
    Exclude directories by basename when using -r
    Example: -exclude node_modules,.git,dist

  • -v
    Verbose output

  • -q
    Quiet mode
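
Exclusion by basename, as described for -exclude, could look roughly like the Go sketch below. The names `parseExclude` and `isExcluded` are hypothetical and the example is not taken from BLIND's source; it only shows that matching happens on the last path element, not on the full path.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// parseExclude turns a comma-separated -exclude value into a set of basenames.
func parseExclude(arg string) map[string]bool {
	set := make(map[string]bool)
	for _, name := range strings.Split(arg, ",") {
		if name = strings.TrimSpace(name); name != "" {
			set[name] = true
		}
	}
	return set
}

// isExcluded reports whether the directory's basename is in the set.
func isExcluded(dir string, set map[string]bool) bool {
	return set[filepath.Base(dir)]
}

func main() {
	set := parseExclude("node_modules,.git,dist")
	fmt.Println(isExcluded("/data/archive/project/node_modules", set)) // true
	fmt.Println(isExcluded("/data/archive/project/src", set))          // false
}
```

Because only the basename is compared, a directory named `dist` is skipped wherever it appears in the tree.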


Performance

  • -j N
    Number of parallel workers for file hashing
    (recommended: 12 for HDD/USB, 48 for SSD/NVMe)

  • -jd M
    Number of directories processed in parallel when using -r
    (recommended: 12)
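
To make the role of -j more concrete, here is a minimal worker-pool sketch in Go that hashes inputs with a bounded number of goroutines. It is an illustration only: `hashAll` is a hypothetical name, the inputs are in-memory byte slices rather than files, and SHA-256 is used throughout to stay within the standard library, whereas BLIND uses BLAKE3 for larger files.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"sync"
)

// hashAll hashes every input using at most `workers` goroutines and
// returns hex digests keyed by name.
func hashAll(inputs map[string][]byte, workers int) map[string]string {
	if workers < 1 {
		workers = 1
	}
	jobs := make(chan string)
	results := make(map[string]string, len(inputs))
	var mu sync.Mutex // guards results
	var wg sync.WaitGroup

	for i := 0; i < workers; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for name := range jobs {
				sum := sha256.Sum256(inputs[name])
				mu.Lock()
				results[name] = hex.EncodeToString(sum[:])
				mu.Unlock()
			}
		}()
	}
	// Feed the workers, then wait for them to drain the queue.
	for name := range inputs {
		jobs <- name
	}
	close(jobs)
	wg.Wait()
	return results
}

func main() {
	inputs := map[string][]byte{
		"empty.bin": {},
		"abc.txt":   []byte("abc"),
	}
	for name, digest := range hashAll(inputs, 4) {
		fmt.Println(digest, name)
	}
}
```

The worker count caps how many hashes run at once, which is why a lower value suits seek-bound devices such as HDDs and a higher one suits SSDs.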


Integrity & resilience

  • -small N
    Threshold in bytes between SHA256 and BLAKE3 (default: 4096)

  • -parr N
    PAR2 redundancy percentage (default: 20)

  • -par2 <path>
    Path to the par2 executable

  • -hash-only
    Generate hashes + manifest only, no PAR2
    Old PAR2 files are removed to avoid stale protection

  • -auto
    Enable smart automatic mode.
    In this mode, BLIND automatically adjusts its behavior based on the detected payload size of each directory.

    The following parameters are computed dynamically:
      - hash-only mode
      - PAR2 redundancy ratio (-parr)
      - manifest creation

    This option overrides:
      - -hash-only
      - -parr
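
The relationship between -parr and the par2 executable can be sketched as below. This is a guess at how such a call could be assembled, not a description of what BLIND does internally: `par2CreateArgs` is a hypothetical helper, and the only external facts relied on are par2cmdline's `create` command and its `-r<n>` redundancy option. The sketch only builds and prints the argument list; it does not run par2.

```go
package main

import (
	"fmt"
	"strings"
)

// par2CreateArgs builds an argument list for "par2 create" with the given
// redundancy percentage. The output name follows the .<directory>.par2
// convention described earlier. (Hypothetical helper, for illustration.)
func par2CreateArgs(parr int, dirName string, files []string) []string {
	args := []string{"create", fmt.Sprintf("-r%d", parr), "." + dirName + ".par2"}
	return append(args, files...)
}

func main() {
	args := par2CreateArgs(20, "photos", []string{"a.jpg", "b.jpg"})
	// The par2 path would come from the -par2 option; "par2" is assumed here.
	fmt.Println("par2", strings.Join(args, " "))
}
```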


Verification

  • -report
    Always display the verification report, even when no issues are found


Automation

Config (write .blind.yaml)

blind config [options] <folder>

Creates a .blind.yaml configuration file in the current directory.

The configuration stores:

  • absolute target folder
  • recursion mode
  • hash-only mode
  • size threshold (small)
  • PAR2 settings (parr, par2)
  • concurrency settings (j, jd)
  • excluded directories

This allows running blind auto later without specifying the folder again.

Example:

blind config -r -exclude .git,node_modules -parr 20 /data/archive

Auto (verify → detect changes → act)

blind auto

Runs an automated integrity workflow based on .blind.yaml.

Workflow:

  1. Verify
     • Verifies hashes (BLAKE3 / SHA256)
     • Verifies PAR2 if present
     • Counts scanned files
     • Never stops on first error

  2. Detect changes
     • Compares current files with .MANIFEST.json
     • Detects added, deleted or modified files
     • Tracks affected directories

  3. Decision (interactive)
     Actions:
       [Enter] : do nothing (default)
       s       : scan (add/update after changes)
       r       : repair (try to repair verify failures)

  4. Actions
     • Scan (s)
         Re-encodes only modified directories
     • Repair (r)
         Attempts repair using PAR2
         Re-verifies after repair

  5. Final status
     • Exits cleanly if integrity is restored
     • Returns non-zero status if verification still fails

What BLIND is (and is not)

BLIND is

  • an integrity and resilience layer
  • suitable for archives, backups, external drives
  • safe on macOS (no kernel extensions)
  • deterministic and auditable

BLIND is not

  • a filesystem
  • a real-time protection mechanism
  • a replacement for backups
  • a RAID system

Comparison with other approaches

Feature                       BLIND        APFS / RAID 1   ZFS
Bit-level checksums           Yes          No              Yes
Silent corruption detection   Yes          No              Yes
Error correction              Yes (PAR2)   No              Yes
Real-time protection          No           Yes             Yes
macOS native                  Yes          Yes             No
Kernel dependencies           No           No              Yes
Portability                   Excellent    Poor            Poor

BLIND complements filesystems and RAID — it does not replace them.


Typical use cases

  • Long-term archives
  • Photo and video collections
  • Research data
  • Source code archives
  • External and removable storage
  • Cold backups
  • Data that must remain correct years later

Philosophy

Storage systems are fast and reliable — until they are silently wrong.
BLIND exists to make corruption visible, verifiable, and recoverable.


License

TBD


Author

Nicolas Hordé: nicolas.horde@linux.com

BLIND was designed for correctness, portability, and long-term trust in data.