feat(scraper): add checkpointing and richer page extraction

Add resumable checkpoint support so long scrapes can recover from interruptions instead of restarting from scratch.

- introduce autosave/load/clear checkpoint flow in `.cache/scrape-state.json`, including SIGINT/SIGTERM save-on-exit handling
- expand parsing/model output to capture legacy and portable infobox fields, primary image URLs, effects, recipes, raw tables, and improved category extraction
- skip infobox tables during recipe parsing to avoid false recipe matches
- add cache log event type, ignore cache/output artifacts, and document new autosave tuning options in README
38 lines
687 B
Plaintext
*.exe
*.exe~
*.dll
*.so
*.dylib

# Test binary, built with `go test -c`
*.test

# Output of the go coverage tool, specifically when used with LiteIDE
*.out

# Dependency directories (remove the comment below to include it)
# vendor/

# Go workspace file
go.work

### Linux ###
*~

# temporary files which can be created if a process still has a handle open of a deleted file
.fuse_hidden*

# KDE directory preferences
.directory

# Linux trash folder which might appear on any partition or disk
.Trash-*

# .nfs files are created when an open file is removed but is still being accessed
.nfs*

# End of https://www.toptal.com/developers/gitignore/api/go,linux

.cache/

outward_data.json