feature/add supplements: Adds first-class support for "supplement" pages #6
Open
heroheman wants to merge 4 commits into scp-data:main from
This was referenced Dec 17, 2025
f257804 to d3ae165
- Introduce ScpSupplement class for item representation
- Implement ScpSupplementSpider to crawl supplement pages
- Update makefile to include supplement in data targets

- Implement run_postproc_supplement to process SCP supplement data
- Create necessary directories and handle data extraction
- Store processed supplements in JSON format for further use

- Added instructions for crawling pages tagged as 'supplement'
- Updated content structure to include multiple content types
- Clarified post-processing details for supplements
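
The post-processing commit above (create directories, extract data, store JSON) can be sketched roughly as follows. This is a minimal standard-library illustration, not the PR's actual code: the function name `postprocess_supplements`, the input keys `link` and `raw_content`, and the split of metadata into `index.json` versus page bodies into `content_supplement.json` are all assumptions made for the example.

```python
import json
from pathlib import Path


def postprocess_supplements(raw_items, out_dir):
    """Split crawled supplement records into an index file and a content file.

    Hypothetical stand-in for run_postproc_supplement; the field names and
    the metadata/content split are assumptions, not taken from the PR.
    """
    out_path = Path(out_dir)
    # Create the output directory (and any missing parents) if needed.
    out_path.mkdir(parents=True, exist_ok=True)

    index = {}
    content = {}
    for item in raw_items:
        slug = item["link"]  # assumed key, e.g. "planetfall-7"
        # Metadata goes to the index; the page body is stored separately.
        index[slug] = {k: v for k, v in item.items() if k != "raw_content"}
        content[slug] = item.get("raw_content", "")

    (out_path / "index.json").write_text(
        json.dumps(index, indent=2), encoding="utf-8"
    )
    (out_path / "content_supplement.json").write_text(
        json.dumps(content, indent=2), encoding="utf-8"
    )
    return index
```

Calling `postprocess_supplements(items, "data/processed/supplement")` would then produce both files in the directory named in the PR description.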
d3ae165 to e9f6ba0
Description
Adds first-class support for "supplement" pages from the SCP Wiki, enabling automated crawling and post-processing of supplementary content.
Changes
- ScpSupplement class in items.py
- ScpSupplementSpider to crawl pages tagged with supplement from https://scp-wiki.wikidot.com/system:page-tags/tag/supplement
- run_postproc_supplement() command that:
  - … planetfall-7)
  - … data/processed/supplement/ with index.json and content_supplement.json
- … scp_crawl and scp_postprocess targets

Output Structure
{
  "planetfall-7": {
    "title": "Planetfall: 7",
    "parent_scp": null,
    "parent_tale": "planetfall",
    "created_at": "2023-05-15T14:30:00",
    ...
  }
}

Testing
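
As a hedged sketch of one way to sanity-check the output structure shown above, the snippet below verifies that each index entry carries the four fields visible in the example and that `created_at` parses as an ISO 8601 timestamp. The helper name `check_index` and the `REQUIRED_KEYS` set are hypothetical; the fields elided by `...` in the example are not checked because they are not known here.

```python
from datetime import datetime

# Only the fields visible in the example output; anything else is unknown.
REQUIRED_KEYS = {"title", "parent_scp", "parent_tale", "created_at"}


def check_index(index):
    """Return a list of problems found in a supplement index mapping."""
    problems = []
    for slug, entry in index.items():
        missing = REQUIRED_KEYS - entry.keys()
        if missing:
            problems.append(f"{slug}: missing {sorted(missing)}")
            continue
        try:
            # Raises ValueError for malformed strings, TypeError for non-strings.
            datetime.fromisoformat(entry["created_at"])
        except (TypeError, ValueError):
            problems.append(f"{slug}: created_at is not ISO 8601")
    return problems


sample = {
    "planetfall-7": {
        "title": "Planetfall: 7",
        "parent_scp": None,
        "parent_tale": "planetfall",
        "created_at": "2023-05-15T14:30:00",
    }
}
print(check_index(sample))
```

In practice the mapping would be loaded with `json.load` from `data/processed/supplement/index.json` rather than written inline.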