Ready to start adding hashes

2026-02-17 15:53:42 +00:00
parent b158bfc4a0
commit b361201cf2
2 changed files with 155 additions and 36 deletions


@@ -1,36 +0,0 @@
Improve ZXDB downloads with local mirroring and inline preview
This commit implements a comprehensive local file mirror system for
ZXDB and WoS downloads, allowing users to access local archives
directly through the explorer UI.
Key Changes:
Local File Mirroring & Proxy:
- Added `ZXDB_LOCAL_FILEPATH` and `WOS_LOCAL_FILEPATH` to `src/env.ts`
and `example.env` for opt-in local mirroring.
- Implemented `resolveLocalLink` in `src/server/repo/zxdb.ts` to map
database `file_link` paths to local filesystem paths based on
configurable prefixes.
- Created `src/app/api/zxdb/download/route.ts` to safely proxy local
files, preventing path traversal and serving with appropriate
`Content-Type` and `Content-Disposition`.
- Updated `docs/ZXDB.md` with setup instructions and resolution logic.
UI Enhancements & Grouping:
- Grouped downloads and scraps by type (e.g., Inlay, Game manual, Tape
image) in `EntryDetail.tsx` and `ReleaseDetail.tsx` for better
organization.
- Introduced `FileViewer.tsx` component to provide inline previews
for supported formats (.txt, .nfo, .png, .jpg, .gif, .pdf).
- Added a "Preview" button for local mirrors of supported file types.
- Optimized download tables with badge-style links for local/remote
sources.
Guideline Updates:
- Updated `AGENTS.md` to clarify commit message handling: edit or
append to `COMMIT_EDITMSG` instead of overwriting.
- Re-emphasized testing rules: use `tsc --noEmit`, do not restart
dev-server, and avoid `pnpm build` during development.
Signed-off-by: junie@lucy.xalior.com


@@ -0,0 +1,155 @@
# Software Hashes Plan
Plan for adding a derived `software_hashes` table, its update pipeline, and a JSON snapshot lifecycle so the data survives DB wipes.

---
## 1) Goals and Scope (Plan Step 1)
- Create and maintain `software_hashes`, at this stage for tape-image downloads only.
- Preserve existing `_CONTENTS` folders; only create missing ones.
- Export `software_hashes` to JSON after each bulk update.
- Reimport `software_hashes` JSON during DB wipe in `bin/import_mysql.sh` (or a helper script it invokes).
- Ensure all scripts are idempotent and resume-safe.
---
## 2) Confirm Pipeline Touchpoints (Plan Step 2)
- Verify `bin/import_mysql.sh` is the authoritative DB wipe/import entry point.
- Confirm `bin/sync-downloads.mjs` remains responsible only for CDN cache sync.
- Confirm `src/server/schema/zxdb.ts` uses `downloads.id` as the natural FK target.
---
## 3) Define Data Model: `software_hashes` (Plan Step 3)
### Table naming and FK alignment
- Table: `software_hashes`.
- FK: `download_id` → `downloads.id`.
- Column names follow existing DB `snake_case` conventions.
### Planned columns
- `download_id` (PK or unique index; FK to `downloads.id`)
- `md5`
- `crc32`
- `size_bytes`
- `updated_at`
### Planned indexes / constraints
- Unique index on `download_id`.
- Index on `md5` for reverse lookup.
- Index on `crc32` for reverse lookup.
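The columns and indexes above could be sketched as DDL. Exact types are still open (see Plan Step 10), so everything below — integer widths, hash storage as fixed-length hex strings, the timestamp default — is an assumption, not a decided schema:

```sql
-- Hypothetical DDL sketch; column types to be confirmed in Plan Step 10.
CREATE TABLE software_hashes (
  download_id INT UNSIGNED    NOT NULL,
  md5         CHAR(32)        NOT NULL,  -- lowercase hex digest
  crc32       CHAR(8)         NOT NULL,  -- zero-padded lowercase hex
  size_bytes  BIGINT UNSIGNED NOT NULL,
  updated_at  TIMESTAMP       NOT NULL DEFAULT CURRENT_TIMESTAMP
                              ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (download_id),
  KEY idx_software_hashes_md5 (md5),
  KEY idx_software_hashes_crc32 (crc32),
  CONSTRAINT fk_software_hashes_download
    FOREIGN KEY (download_id) REFERENCES downloads (id)
) ENGINE=InnoDB;
```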
---
## 4) Define JSON Snapshot Format (Plan Step 4)
### Location
- Default: `data/zxdb/software_hashes.json` (or another agreed path).
### Structure
```json
{
  "exportedAt": "2026-02-17T15:18:00.000Z",
  "rows": [
    {
      "download_id": 123,
      "md5": "...",
      "crc32": "...",
      "size_bytes": 12345,
      "updated_at": "2026-02-17T15:18:00.000Z"
    }
  ]
}
```
### Planned import policy
- If snapshot exists: truncate `software_hashes` and bulk insert.
- If snapshot missing: log and continue without error.
---
## 5) Implement Tape Image Update Workflow (Plan Step 5)
### Planned script
- `bin/update-software-hashes.mjs` (name can be adjusted).
### Planned input dataset
- Query `downloads` for tape-image rows (filter by `filetype_id` or joined `filetypes` table).
### Planned per-item process
1. Resolve local zip path using the same CDN mapping used by `sync-downloads`.
2. Compute `_CONTENTS` folder name: `<zip filename>_CONTENTS` (exact match).
3. If `_CONTENTS` exists, keep it untouched.
4. If missing, extract the zip into `_CONTENTS` using an extraction library rather than shelling out to `unzip`, avoiding shell-expansion issues with brackets in filenames.
5. Locate tape file inside (`.tap`, `.tzx`, `.pzx`, `.csw`):
- Apply a deterministic priority order.
- If multiple candidates remain, log and skip (or record ambiguity).
6. Compute `md5`, `crc32`, and `size_bytes` for the selected file.
7. Upsert into `software_hashes` keyed by `download_id`.
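The tape-file choice in step 5 can be made fully deterministic with a fixed extension priority. A minimal sketch — the priority order shown is an assumption to be agreed, and the function name is hypothetical:

```javascript
// Hypothetical priority order for step 5; to be confirmed.
const TAPE_PRIORITY = ['.tzx', '.tap', '.pzx', '.csw'];

// Returns the single chosen entry name, or null when no tape file
// exists or the choice is ambiguous (several files share the best
// extension), matching the "log and skip" policy above.
function pickTapeFile(entryNames) {
  for (const ext of TAPE_PRIORITY) {
    const matches = entryNames.filter((name) =>
      name.toLowerCase().endsWith(ext),
    );
    if (matches.length === 1) return matches[0];
    if (matches.length > 1) return null; // ambiguous: caller logs and skips
  }
  return null; // no tape file found
}
```

Skipping on ambiguity at the highest matching priority (rather than falling through to a lower-priority extension) keeps the result reproducible across runs.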
### Planned error handling
- Log missing zips or missing tape files.
- Continue after recoverable errors; fail only on critical DB errors.
---
## 6) Implement JSON Export Lifecycle (Plan Step 6)
- After each bulk update, export `software_hashes` to JSON.
- Write atomically (temp file + rename).
- Include `exportedAt` timestamp in snapshot.
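The temp-file-plus-rename pattern works because `rename()` is atomic within a filesystem, so readers never observe a half-written snapshot. A sketch of the export step (function name and default path are assumptions):

```javascript
import { writeFile, rename } from 'node:fs/promises';

// Sketch of the atomic snapshot export: write to a temp file next to
// the target, then rename over the final path in one atomic step.
async function exportSnapshot(rows, outPath = 'data/zxdb/software_hashes.json') {
  const snapshot = { exportedAt: new Date().toISOString(), rows };
  const tmpPath = `${outPath}.tmp`;
  await writeFile(tmpPath, JSON.stringify(snapshot, null, 2));
  await rename(tmpPath, outPath);
}
```

Writing the temp file into the same directory (not `/tmp`) matters: rename is only atomic when source and destination are on the same filesystem.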
---
## 7) Reimport During Wipe (`bin/import_mysql.sh`) (Plan Step 7)
### Planned placement
- Immediately after database creation and ZXDB SQL import completes.
### Planned behavior
- Attempt to read JSON snapshot.
- If present, truncate and reinsert `software_hashes`.
- Log imported row count.
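A helper invoked from `bin/import_mysql.sh` could translate the snapshot back into SQL. As a testable sketch, the statement generation might look like the following — the function name, batch size, and naive quoting are all assumptions (a real implementation should use the driver's placeholders):

```javascript
// Hypothetical sketch: turn snapshot rows into the SQL statements the
// reimport step would execute (TRUNCATE, then batched INSERTs).
// Values are escaped naively here for illustration only.
function snapshotToSql(rows, batchSize = 500) {
  const esc = (v) =>
    typeof v === 'number' ? String(v) : `'${String(v).replace(/'/g, "''")}'`;
  const statements = ['TRUNCATE TABLE software_hashes;'];
  for (let i = 0; i < rows.length; i += batchSize) {
    const values = rows
      .slice(i, i + batchSize)
      .map((r) =>
        `(${esc(r.download_id)}, ${esc(r.md5)}, ${esc(r.crc32)}, ` +
        `${esc(r.size_bytes)}, ${esc(r.updated_at)})`,
      )
      .join(',\n');
    statements.push(
      `INSERT INTO software_hashes (download_id, md5, crc32, size_bytes, updated_at) VALUES\n${values};`,
    );
  }
  return statements;
}
```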
---
## 8) Add Idempotency and Resume Support (Plan Step 8)
- State file similar to `.sync-downloads.state.json` to track last `download_id` processed.
- CLI flags:
- `--resume` (default)
- `--start-from-id`
- `--rebuild-all`
- Reprocess when zip file size or mtime changes.
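The reprocess decision above reduces to a small pure function. A sketch, assuming the state file maps `download_id` to the size and mtime recorded at last processing (that shape is an assumption):

```javascript
// Sketch of the step 8 reprocess decision. stateEntry is the record
// for one download_id from the state file, or undefined if never seen;
// zipStat is the current fs.Stats of the zip.
function shouldReprocess(stateEntry, zipStat, rebuildAll = false) {
  if (rebuildAll) return true;          // --rebuild-all forces everything
  if (!stateEntry) return true;         // never processed before
  return (
    stateEntry.size !== zipStat.size || // zip replaced or truncated
    stateEntry.mtimeMs !== zipStat.mtimeMs
  );
}
```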
---
## 9) Validation Checklist (Plan Step 9)
- `_CONTENTS` folders are never deleted.
- Hashes match expected MD5/CRC32 for known samples.
- JSON snapshot is created and reimported correctly.
- Reverse lookup by `md5`/`crc32`/`size_bytes` identifies misnamed files.
- Script can resume safely after interruption.
---
## 10) Open Questions / Confirmations (Plan Step 10)
- Final `software_hashes` column list and types.
- Exact JSON snapshot path.
- Filetype IDs that map to “Tape Image” in `downloads`.