Best Open-Source Web Scraping & Crawling Tools

Explore curated open-source tools in the Web Scraping & Crawling category. Compare technologies, see alternatives, and find the right solution for your workflow.

Firecrawl

Turn any website into clean, LLM‑ready data instantly

Stars: 89,174
License: AGPL-3.0
Last commit: 7 hours ago
Language: TypeScript
Status: Active

Scrapling

Adaptive web scraping that survives site changes effortlessly

Stars: 25,451
License: BSD-3-Clause
Last commit: 7 hours ago
Language: Python
Status: Active

Crawl4AI

Turn the web into clean, LLM-ready Markdown instantly

Stars: 61,499
License: Apache-2.0
Last commit: 14 hours ago
Language: Python
Status: Active

Maxun

Train a web‑scraping robot in minutes, no code required

Stars: 15,206
License: AGPL-3.0
Last commit: 16 hours ago
Language: TypeScript
Status: Active

Katana

Fast, configurable web crawler with headless and JavaScript support

Stars: 15,790
License: MIT
Last commit: 1 day ago
Language: Go
Status: Active

Crawlee

Build fast, human-like web scrapers with a single library

Stars: 22,083
License: Apache-2.0
Last commit: 1 day ago
Language: TypeScript
Status: Active

AnyCrawl

High-performance web, site, and SERP crawler with AI extraction

Stars: 2,758
License: MIT
Last commit: 1 day ago
Language: TypeScript
Status: Active

ChangeDetection.io

Real-time website change monitoring with instant multi-channel alerts

Stars: 30,469
License: Apache-2.0
Last commit: 1 day ago
Language: Python
Status: Active

LLM Scraper

Extract structured data from any webpage using LLMs

Stars: 6,225
License: MIT
Last commit: 4 days ago
Language: TypeScript
Status: Active

Scrapy

Fast, high-level Python framework for web crawling and scraping

Stars: 60,634
License: BSD-3-Clause
Last commit: 5 days ago
Language: Python
Status: Active

Apache Nutch

Scalable, extensible Java web crawler for large‑scale data collection

Stars: 3,139
License: Apache-2.0
Last commit: 9 days ago
Language: Java
Status: Active

ScrapeGraphAI

LLM‑powered web scraping pipelines in just five lines of code

Stars: 22,878
License: MIT
Last commit: 11 days ago
Language: Python
Status: Active

Colly

Fast, elegant web scraping framework for Go developers

Stars: 25,144
License: Apache-2.0
Last commit: 18 days ago
Language: Go
Status: Active

WebMagic

Scalable Java crawler framework with flexible API and annotations

Stars: 11,702
License: Apache-2.0
Last commit: 2 months ago
Language: Java
Status: Active

AutoScraper

Automatic, fast, lightweight web scraper that learns from examples

Stars: 7,110
License: MIT
Last commit: 9 months ago
Language: Python
Status: Stable