Pipeline orchestrator #4

Open
opened 2026-06-14 18:19:29 +00:00 by glow · 0 comments
Owner

MiningPipeline chains: URL list -> DomainThrottle -> Firecrawl -> sibyl-extractor -> quality gate -> SibylStore. CLI via web_miner/cli.py with --file, --rate, --delay, --min-words, --max options. Tests: 10/10 passing.
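
For discussion, here is a minimal sketch of how the chain and the CLI flags listed above could fit together. It is not the code in web_miner/cli.py. The stage callables (`fetch`, `extract`, `store`) are hypothetical stand-ins for Firecrawl, sibyl-extractor and SibylStore. The flag semantics are also assumptions: `--delay` as a minimum gap between requests to the same domain, `--min-words` as the quality-gate threshold, `--max` as a cap on URLs processed. The defaults are invented, and `--rate` is only parsed because its meaning is not stated here.

```python
import argparse
import time
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, Optional
from urllib.parse import urlparse


class DomainThrottle:
    """Hypothetical throttle: keeps requests to one host at least `delay` seconds apart."""

    def __init__(self, delay: float, clock=time.monotonic, sleep=time.sleep):
        self.delay = delay
        self._clock = clock
        self._sleep = sleep
        self._last: Dict[str, float] = {}  # host -> time of last request

    def wait(self, url: str) -> None:
        host = urlparse(url).netloc
        last = self._last.get(host)
        if last is not None:
            remaining = self.delay - (self._clock() - last)
            if remaining > 0:
                self._sleep(remaining)
        self._last[host] = self._clock()


@dataclass
class MiningPipeline:
    fetch: Callable[[str], str]        # stand-in for the Firecrawl stage
    extract: Callable[[str], str]      # stand-in for sibyl-extractor
    store: Callable[[str, str], None]  # stand-in for SibylStore
    throttle: DomainThrottle
    min_words: int = 0                 # assumed meaning of --min-words
    max_urls: Optional[int] = None     # assumed meaning of --max

    def run(self, urls: Iterable[str]) -> int:
        """Process URLs in order and return how many documents were stored."""
        stored = 0
        for i, url in enumerate(urls):
            if self.max_urls is not None and i >= self.max_urls:
                break
            self.throttle.wait(url)
            text = self.extract(self.fetch(url))
            # Quality gate: skip documents with too few words.
            if len(text.split()) < self.min_words:
                continue
            self.store(url, text)
            stored += 1
        return stored


def build_parser() -> argparse.ArgumentParser:
    """Flag names come from the issue; types and defaults are assumptions."""
    p = argparse.ArgumentParser(prog="web_miner")
    p.add_argument("--file", required=True, help="file with one URL per line")
    p.add_argument("--rate", type=float, default=1.0, help="semantics not specified here")
    p.add_argument("--delay", type=float, default=0.0, help="seconds between same-domain requests")
    p.add_argument("--min-words", type=int, default=50, help="quality gate threshold")
    p.add_argument("--max", type=int, default=None, help="maximum number of URLs")
    return p
```

Passing the stages in as callables keeps the sketch testable without network access. The real pipeline may wire its components differently.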

Reference
glow-all/sibyl-web-miner#4