Best Open-Source Web Scraping & Crawling Tools

Explore curated open-source tools in the Web Scraping & Crawling category. Compare technologies, see alternatives, and find the right solution for your workflow.

10+ projects

Crawlee

Build fast, human-like web scrapers with a single library

Stars: 21,214
License: Apache-2.0
Last commit: 1 hour ago
TypeScript · Active

Firecrawl

Turn any website into clean, LLM‑ready data instantly

Stars: 76,406
License: AGPL-3.0
Last commit: 1 hour ago
TypeScript · Active

Apache Nutch

Scalable, extensible Java web crawler for large‑scale data collection

Stars: 3,114
License: Apache-2.0
Last commit: 9 hours ago
Java · Active

Maxun

Train a web‑scraping robot in minutes, no code required

Stars: 14,162
License: AGPL-3.0
Last commit: 19 hours ago
TypeScript · Active

Scrapling

Adaptive web scraping that survives site changes effortlessly

Stars: 8,824
License: BSD-3-Clause
Last commit: 20 hours ago
Python · Active

ScrapeGraphAI

LLM‑powered web scraping pipelines in just five lines of code

Stars: 22,341
License: MIT
Last commit: 1 day ago
Python · Active

Scrapy

Fast, high-level Python framework for web crawling and scraping

Stars: 59,514
License: BSD-3-Clause
Last commit: 1 day ago
Python · Active

Katana

Fast, configurable web crawler with headless and JavaScript support

Stars: 15,422
License: MIT
Last commit: 2 days ago
Go · Active

Colly

Fast, elegant web scraping framework for Go developers

Stars: 25,016
License: Apache-2.0
Last commit: 15 days ago
Go · Active

AnyCrawl

High-performance web, site, and SERP crawler with AI extraction

Stars: 2,529
License: MIT
Last commit: 19 days ago
TypeScript · Active

WebMagic

Scalable Java crawler framework with flexible API and annotations

Stars: 11,689
License: Apache-2.0
Last commit: 1 month ago
Java · Active

LLM Scraper

Extract structured data from any webpage using LLMs

Stars: 6,164
License: MIT
Last commit: 1 month ago
TypeScript · Active

AutoScraper

Automatic, fast, lightweight web scraper that learns from examples

Stars: 7,074
License: MIT
Last commit: 7 months ago
Python · Stable