

Adaptive web scraping that survives site changes effortlessly
Scrapling delivers adaptive web scraping that automatically adjusts to site redesigns, offering stealth, dynamic, and async fetchers, a fast parser, and a CLI for both developers and non‑programmers.

Scrapling is a Python library built for developers and data teams who need reliable web scrapers that keep working as websites evolve. Its adaptive selector engine learns from structural changes, eliminating the constant need to rewrite CSS or XPath queries.
The library ships with multiple fetchers—standard, stealthy, and full‑browser dynamic—each capable of handling anti‑bot measures, TLS fingerprint impersonation, and headless operation. A rapid parsing engine supports CSS, XPath, and BeautifulSoup‑style selectors, plus advanced navigation methods like sibling and similarity queries. Async sessions enable high‑throughput crawling, while a powerful CLI lets non‑programmers extract content directly to markdown, text, or HTML files.
Install via `pip install scrapling` and integrate it into scripts, Jupyter notebooks, or CI pipelines. Use context‑aware sessions for persistent cookies and headers, or one‑off fetch calls for quick tasks. Browser‑based fetchers require a compatible Chromium/Firefox binary, which Scrapling can launch automatically in headless mode.
E‑commerce price monitoring
Continues to collect product prices even after the retailer redesigns its layout, eliminating selector rewrites.
News aggregation behind Cloudflare
StealthyFetcher bypasses Cloudflare challenges to reliably extract article headlines and bodies.
Bulk data collection with async concurrency
AsyncStealthySession fetches hundreds of pages in parallel, dramatically reducing total crawl time.
Non‑programmer report generation via CLI
Users run `scrapling extract` to export web page content directly to markdown or text files without writing code.
When `adaptive=True` is set, Scrapling analyzes the page structure and attempts to locate the target element even if its original CSS or XPath path has changed, using similarity heuristics.
The stealthy and dynamic fetchers launch a headless Chromium or Firefox instance; Scrapling will download a compatible binary if none is found on the system.
For concurrent fetching, the library provides `AsyncStealthySession` and `AsyncDynamicSession` classes that integrate with `asyncio`.
The built‑in CLI (`scrapling extract`) lets you fetch pages and export content to HTML, markdown, or plain text directly from the command line.
Scrapling follows standard Python packaging policy and supports the Python versions listed on its PyPI page; refer to the official documentation for the exact range.
Project at a glance
Active · Last synced 4 days ago