Best Open-Source Web Scraping & Crawling Tools

Explore curated open-source tools in the Web Scraping & Crawling category. Compare technologies, see alternatives, and find the right solution for your workflow.

10+ projects

Scrapling

Adaptive web scraping that survives site changes effortlessly

Stars: 34,629
License: BSD-3-Clause
Last commit: 17 days ago
Python · Active

Firecrawl

Turn any website into clean, LLM‑ready data instantly

Stars: 104,148
License: AGPL-3.0
Last commit: 17 days ago
TypeScript · Active

Crawl4AI

Turn the web into clean, LLM-ready Markdown instantly

Stars: 63,373
License: Apache-2.0
Last commit: 18 days ago
Python · Active

ChangeDetection.io

Real-time website change monitoring with instant multi-channel alerts

Stars: 30,966
License: Apache-2.0
Last commit: 18 days ago
Python · Active

Katana

Fast, configurable web crawler with headless and JavaScript support

Stars: 16,420
License: MIT
Last commit: 18 days ago
Go · Active

Scrapy

Fast, high-level Python framework for web crawling and scraping

Stars: 61,075
License: BSD-3-Clause
Last commit: 18 days ago
Python · Active

Maxun

Train a web‑scraping robot in minutes, no code required

Stars: 15,332
License: AGPL-3.0
Last commit: 19 days ago
TypeScript · Active

Crawlee

Build fast, human-like web scrapers with a single library

Stars: 22,649
License: Apache-2.0
Last commit: 20 days ago
TypeScript · Active

AnyCrawl

High-performance web, site, and SERP crawler with AI extraction

Stars: 2,783
License: MIT
Last commit: 21 days ago
MDX · Active

ScrapeGraphAI

LLM‑powered web scraping pipelines in just five lines of code

Stars: 23,218
License: MIT
Last commit: 21 days ago
Python · Active

LLM Scraper

Extract structured data from any webpage using LLMs

Stars: 6,253
License: MIT
Last commit: 25 days ago
TypeScript · Active

Apache Nutch

Scalable, extensible Java web crawler for large‑scale data collection

Stars: 3,144
License: Apache-2.0
Last commit: 1 month ago
Java · Active

Colly

Fast, elegant web scraping framework for Go developers

Stars: 25,204
License: Apache-2.0
Last commit: 2 months ago
Go · Active

WebMagic

Scalable Java crawler framework with flexible API and annotations

Stars: 11,699
License: Apache-2.0
Last commit: 4 months ago
Java · Stable

AutoScraper

Automatic, fast, lightweight web scraper that learns from examples

Stars: 7,132
License: MIT
Last commit: 10 months ago
Python · Stable