Data Extraction & Web Scraping
Open-source alternatives for data extraction and web scraping, curated by the directory.
Lightpanda
AGPL-3.0
Purpose-built headless browser for web automation and AI workflows, claiming 10x faster execution and 10x lower memory usage than headless Chrome.
Crawl4AI
Apache-2.0
Open-source web crawler and scraper that produces clean, structured output optimized for LLMs, RAG pipelines, and AI agents. Supports async crawling, CSS/XPath/LLM extraction, and stealth browser control.
Documind
Unknown
Uses LLMs to extract structured data from PDFs, images, and other documents, streamlining document processing and automation.
Firecrawl
AGPL-3.0
API for AI agents to search, scrape, crawl, and interact with the live web, returning clean Markdown, structured JSON, or screenshots from any page.
Maxun
AGPL-3.0
Train robots to scrape web data automatically in about 2 minutes, with no coding required. Uses AI to handle pagination, CAPTCHAs, and layout changes.