Best Open-Source Web Scraping & Crawling Tools

Explore curated open-source tools in the Web Scraping & Crawling category. Compare technologies, see alternatives, and find the right solution for your workflow.

10+ projects

Crawlee

Build fast, human-like web scrapers with a single library

Stars
20,704
License
Apache-2.0
Last commit
3 days ago
TypeScript · Active

Firecrawl

Turn any website into clean, LLM‑ready data instantly

Stars
68,941
License
AGPL-3.0
Last commit
3 days ago
TypeScript · Active

Maxun

Train a web‑scraping robot in minutes, no code required

Stars
13,950
License
AGPL-3.0
Last commit
4 days ago
TypeScript · Active

Katana

Fast, configurable web crawler with headless and JavaScript support

Stars
14,885
License
MIT
Last commit
4 days ago
Go · Active

ScrapeGraphAI

LLM‑powered web scraping pipelines in just five lines of code

Stars
21,921
License
MIT
Last commit
5 days ago
Python · Active

Scrapy

Fast, high-level Python framework for web crawling and scraping

Stars
59,114
License
BSD-3-Clause
Last commit
5 days ago
Python · Active

Apache Nutch

Scalable, extensible Java web crawler for large‑scale data collection

Stars
3,093
License
Apache-2.0
Last commit
6 days ago
Java · Active

Scrapling

Adaptive web scraping that survives site changes effortlessly

Stars
8,266
License
BSD-3-Clause
Last commit
9 days ago
Python · Active

Colly

Fast, elegant web scraping framework for Go developers

Stars
24,863
License
Apache-2.0
Last commit
11 days ago
Go · Active

AnyCrawl

High-performance web, site, and SERP crawler with AI extraction

Stars
2,461
License
MIT
Last commit
11 days ago
TypeScript · Active

WebMagic

Scalable Java crawler framework with flexible API and annotations

Stars
11,663
License
Apache-2.0
Last commit
26 days ago
Java · Active

LLM Scraper

Extract structured data from any webpage using LLMs

Stars
6,116
License
MIT
Last commit
28 days ago
TypeScript · Active

AutoScraper

Automatic, fast, lightweight web scraper that learns from examples

Stars
7,040
License
MIT
Last commit
5 months ago
Python · Stable