diff --git a/docs/plans/plan_feature-software-hashes_implimentation.md b/docs/plans/plan_feature-software-hashes_implimentation.md
new file mode 100644
index 0000000..18384d7
--- /dev/null
+++ b/docs/plans/plan_feature-software-hashes_implimentation.md
@@ -0,0 +1,53 @@
+# WIP: Software Hashes
+
+**Branch:** `feature/software-hashes`
+**Started:** 2026-02-17
+**Status:** In Progress
+
+## Plan
+
+Implements [docs/plans/software-hashes.md](software-hashes.md): a derived `software_hashes` table storing MD5, CRC32, and size for tape-image contents extracted from download zips.
+
+### Tasks
+
+- [ ] Create `data/zxdb/` directory (for JSON snapshot)
+- [ ] Add `software_hashes` Drizzle schema model
+- [ ] Create `bin/update-software-hashes.mjs`: main pipeline script
+  - [ ] DB query for tape-image downloads (`filetype_id IN (8, 22)`)
+  - [ ] Resolve local zip path via CDN mapping
+  - [ ] Extract `_CONTENTS` (skip if exists)
+  - [ ] Find tape file (.tap/.tzx/.pzx/.csw) with priority order
+  - [ ] Compute MD5, CRC32, size_bytes
+  - [ ] Upsert into software_hashes
+  - [ ] State file for resume support
+- [ ] JSON export after bulk update (atomic write)
+- [ ] Update `bin/import_mysql.sh` to reimport snapshot on DB wipe
+- [ ] Add pnpm script entries
+
+## Progress Log
+
+### 2026-02-17T16:00Z
+- Started work. Branch created from `main` at `b361201`.
+- Explored codebase: understood DB schema, CDN mapping, import pipeline.
+- Key findings:
+  - filetype_id 8 = "Tape image" (33,427 rows), 22 = "BUGFIX tape image" (98 rows)
+  - CDN_CACHE = /Volumes/McFiver/CDN, paths: SC/ (zxdb) and WoS/ (pub)
+  - `_CONTENTS` dirs exist in WoS but not yet in SC
+  - data/zxdb/ directory needs creation
+  - import_mysql.sh needs software_hashes reimport step
+
+## Decisions & Notes
+
+- Target filetype IDs: 8 and 22 (tape image + bugfix tape image).
+- Tape file priority: .tap > .tzx > .pzx > .csw (most common first).
+- CDN_CACHE hard-coded to /Volumes/McFiver/CDN (same as sync-downloads).
+- JSON snapshot at data/zxdb/software_hashes.json.
+- Use Node.js built-in `crypto` for MD5; compute CRC32 with a buffer-based calculation.
+
+## Blockers
+
+None currently.
+
+## Commits
+
+b361201 - Ready to start adding hashes
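
The "Find tape file" and "Compute MD5, CRC32, size_bytes" tasks in the plan above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the contents of `bin/update-software-hashes.mjs`: the names `pickTapeFile`, `hashBuffer`, `crc32`, and `TAPE_EXTENSIONS` are hypothetical, CRC32 is assumed to be the common zip/IEEE variant rendered as eight lowercase hex digits, and CommonJS `require` is used so the snippet runs standalone (the real script, being `.mjs`, would presumably use `import`).

```javascript
// Hypothetical sketch: helper names are illustrative, not taken from the repository.
const { createHash } = require('node:crypto');
const path = require('node:path');

// Priority order stated in the plan: .tap > .tzx > .pzx > .csw
const TAPE_EXTENSIONS = ['.tap', '.tzx', '.pzx', '.csw'];

// Lookup table for CRC-32 with the reflected polynomial 0xEDB88320
// (assumed variant: the one used by zip and gzip).
const CRC_TABLE = new Uint32Array(256);
for (let n = 0; n < 256; n++) {
  let c = n;
  for (let k = 0; k < 8; k++) {
    c = c & 1 ? 0xedb88320 ^ (c >>> 1) : c >>> 1;
  }
  CRC_TABLE[n] = c >>> 0;
}

// Buffer-based CRC32; returns an unsigned 32-bit integer.
function crc32(buf) {
  let crc = 0xffffffff;
  for (const byte of buf) {
    crc = CRC_TABLE[(crc ^ byte) & 0xff] ^ (crc >>> 8);
  }
  return (crc ^ 0xffffffff) >>> 0;
}

// Return the first file name matching the highest-priority extension,
// compared case-insensitively, or null if no tape file is present.
function pickTapeFile(fileNames) {
  for (const ext of TAPE_EXTENSIONS) {
    const match = fileNames.find(
      (name) => path.extname(name).toLowerCase() === ext
    );
    if (match) return match;
  }
  return null;
}

// Compute the three values the plan lists: MD5, CRC32, and size in bytes.
function hashBuffer(buf) {
  return {
    md5: createHash('md5').update(buf).digest('hex'),
    crc32: crc32(buf).toString(16).padStart(8, '0'),
    sizeBytes: buf.length,
  };
}
```

In a pipeline like the one planned, `pickTapeFile` would receive the file names found under an extracted `_CONTENTS` directory, and `hashBuffer` would receive the chosen file's bytes (for example from `fs.readFileSync`). How the result maps onto `software_hashes` columns is left open here, since the diff does not show the schema.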