The scraper that re-finds your elements after a redesign

Every scraper rots the same way: the target site ships a redesign, your CSS selectors point at nodes that no longer exist, and the pipeline silently returns empty. Scrapling’s headline feature is a direct answer to that decay. Select an element with auto_save=True and the parser records what it learned about it. When the page later changes, pass adaptive=True and the parser relocates the element from that record, rather than by a brittle path that just broke.

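from scrapling.fetchers import StealthyFetcher

StealthyFetcher.adaptive = True  # turn on the adaptive feature for this fetcher

# First run: fetch the page and save what the matched elements look like.
page = StealthyFetcher.fetch('https://example.com', headless=True, network_idle=True)
products = page.css('.product', auto_save=True)

# Later run, after the site changes its markup: fetch again and relocate them.
page = StealthyFetcher.fetch('https://example.com', headless=True, network_idle=True)
products = page.css('.product', adaptive=True)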

That single idea, selectors that tolerate change, is what separates Scrapling from the parse-and-pray approach most scraping code takes. Everything else in the framework is built around making that adaptive layer usable at any scale.
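
To make the idea concrete, here is a minimal sketch of similarity-based relocation using only the standard library. It is not Scrapling's algorithm or API: the names ElementCollector, fingerprints, similarity, and relocate are hypothetical, and the equal-weight scoring is an assumption chosen for illustration. It shows the general shape of the technique: remember a fingerprint of the element, then pick the closest-looking node once the original selector stops matching.

```python
from difflib import SequenceMatcher
from html.parser import HTMLParser

VOID_TAGS = {"br", "img", "input", "hr", "meta", "link"}


class ElementCollector(HTMLParser):
    """Collects a small fingerprint (tag, attrs, ancestor path, text) per element."""

    def __init__(self):
        super().__init__()
        self.stack = []     # fingerprints of the currently open elements
        self.elements = []  # every element seen, in document order

    def handle_starttag(self, tag, attrs):
        record = {
            "tag": tag,
            "attrs": dict(attrs),
            "path": "/".join(e["tag"] for e in self.stack),
            "text": "",
        }
        self.elements.append(record)
        if tag not in VOID_TAGS:
            self.stack.append(record)

    def handle_endtag(self, tag):
        # Pop back to the matching open tag, tolerating sloppy markup.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i]["tag"] == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        # Attribute text to the innermost open element only.
        if self.stack and data.strip():
            self.stack[-1]["text"] += data.strip()


def fingerprints(html):
    collector = ElementCollector()
    collector.feed(html)
    return collector.elements


def similarity(saved, candidate):
    """Average of four 0..1 scores: tag, ancestor path, attributes, text."""
    def ratio(a, b):
        return SequenceMatcher(None, a, b).ratio()

    def attr_string(el):
        return " ".join(f"{k}={v}" for k, v in sorted(el["attrs"].items()))

    scores = [
        1.0 if saved["tag"] == candidate["tag"] else 0.0,
        ratio(saved["path"], candidate["path"]),
        ratio(attr_string(saved), attr_string(candidate)),
        ratio(saved["text"], candidate["text"]),
    ]
    return sum(scores) / len(scores)


def relocate(saved, html):
    """Return the element in `html` that best matches the saved fingerprint."""
    return max(fingerprints(html), key=lambda el: similarity(saved, el))


old_html = '<ul><li><a class="product" href="/p/7">Blue Mug</a></li></ul>'
new_html = (
    '<nav><a href="/">Home</a></nav>'
    '<section><div><a class="item-link" href="/p/7">Blue Mug</a></div></section>'
)

# "Save" step: remember what the element looked like before the redesign.
saved = next(el for el in fingerprints(old_html) if el["attrs"].get("class") == "product")

# "Adaptive" step: the .product class is gone, so match by resemblance instead.
match = relocate(saved, new_html)
print(match["attrs"]["class"])  # item-link
```

The renamed class and the new wrapper elements no longer matter, because the link target and the text still resemble what was saved. A real implementation would also need a threshold below which it reports "not found" instead of returning the least-bad candidate.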

Three layers, one library

Scrapling is deliberately a full stack rather than a parser alone:

  • Fetchers range from fast HTTP requests that impersonate a browser’s TLS fingerprint and headers (and can speak HTTP/3), through full browser automation via Playwright’s Chromium, to StealthyFetcher, which is built to pass Cloudflare’s Turnstile and similar interstitials.
  • Sessions persist cookies and state across requests, with FetcherSession, StealthySession, and DynamicSession variants, plus a built-in ProxyRotator.
  • Spiders give a Scrapy-like API with start_urls and async parse callbacks, concurrent crawling with per-domain throttling, checkpoint-based pause and resume, a streaming mode, optional robots.txt compliance, and a development cache that replays responses so you can iterate on parse() without re-hitting the target.

It also ships an MCP server and a CLI, so an LLM agent can drive it.
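
The development cache in the Spiders bullet is easiest to picture as a replay layer in front of the fetcher. The sketch below is an assumption-laden illustration, not Scrapling's implementation: cached_fetch and fake_fetch are hypothetical names, and keying the cache on a SHA-256 of the URL is one reasonable choice among several.

```python
import hashlib
import tempfile
from pathlib import Path


def cached_fetch(url, fetch, cache_dir):
    """Return the body for `url`, calling `fetch(url)` only on a cache miss."""
    cache_dir = Path(cache_dir)
    cache_dir.mkdir(parents=True, exist_ok=True)
    key = hashlib.sha256(url.encode("utf-8")).hexdigest()
    cache_file = cache_dir / f"{key}.html"

    if cache_file.exists():
        # Replay the stored response; the target site is not contacted.
        return cache_file.read_text(encoding="utf-8")

    body = fetch(url)
    cache_file.write_text(body, encoding="utf-8")
    return body


# Stand-in for a real network call so the sketch runs offline.
calls = []


def fake_fetch(url):
    calls.append(url)
    return "<html><h1>Catalog</h1></html>"


with tempfile.TemporaryDirectory() as tmp:
    first = cached_fetch("https://example.com/catalog", fake_fetch, tmp)
    second = cached_fetch("https://example.com/catalog", fake_fetch, tmp)

print(len(calls))  # 1: the second call was served from disk
```

With a layer like this in place you can rerun a parse callback as often as you like while the target sees a single request.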

Install

Scrapling needs Python 3.10 or higher. The base install is parsing only:

pip install scrapling

For the browser-based fetchers you need the extra plus a one-time browser download:

pip install "scrapling[fetchers]"
scrapling install

The AI and shell features have their own extras (scrapling[ai], scrapling[shell]), and scrapling[all] pulls everything. Remember that any extra still requires scrapling install for the browser dependencies.

What the issues warn you about

The open-issue count is near zero (1 as of June 2026), so the signal lives in the resolved threads, and they cluster around the framework’s harder features:

  • Proxy rotation has had rough edges. One reported issue describes ProxyRotator raising NotImplementedError. If you build a crawl around rotation, exercise that code path on your installed version early.
  • The MCP integration fights stdout. A notable thread documents Scrapling’s “Downloading…” progress messages bleeding into the MCP protocol stream and corrupting communication. If you wire it to an agent over MCP, watch for noise on stdout.
  • Parser edge cases surface on real pages. An error where a TextHandlers object lacked a partition attribute is the kind of thing that only shows up against messy live HTML, which is exactly where a scraper lives.
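
The stdout problem in the MCP bullet is a general hazard for any protocol that runs over standard output, and one common defence is to divert stray prints before they reach the stream. The following is a minimal sketch of that defence, assuming the noise comes from Python-level print calls. noisy_fetch and run_quietly are hypothetical names, and nothing here is taken from Scrapling's own fix.

```python
import contextlib
import io
import json
import sys


def noisy_fetch():
    """Hypothetical stand-in for a library call that prints progress to stdout."""
    print("Downloading...")
    return {"status": 200}


def run_quietly(func, log_stream=None):
    """Run `func` with anything it prints diverted away from stdout."""
    target = log_stream if log_stream is not None else sys.stderr
    with contextlib.redirect_stdout(target):
        return func()


log = io.StringIO()
protocol = io.StringIO()  # stands in for the real stdout that carries JSON-RPC

with contextlib.redirect_stdout(protocol):
    result = run_quietly(noisy_fetch, log_stream=log)
    # Only the protocol message itself is written to the stream.
    print(json.dumps({"jsonrpc": "2.0", "id": 1, "result": result}))

message = json.loads(protocol.getvalue())
print(message["result"])  # {'status': 200}
```

Note the limit of this approach: contextlib.redirect_stdout only swaps sys.stdout inside the Python process, so output written by subprocesses or directly to file descriptor 1 would still leak through.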

These are the seams of an ambitious framework, not signs of neglect: the project ships often, with v0.4.9 in June 2026.

An honest note on what this is for

Scrapling leans hard into anti-bot evasion, and its README is dense with proxy-vendor sponsorships. That tells you the audience: people scraping sites that actively resist it. The framework includes optional robots.txt compliance, but using stealth fetchers to bypass protections can violate a site’s terms of service. The capability is neutral; the responsibility for staying inside the law and a site’s rules is yours.
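
If you want a check of your own on top of whatever the framework offers, Python's standard library can evaluate robots.txt rules directly. This sketch uses urllib.robotparser on an inline, made-up robots.txt so it runs offline. The allowed helper and the my-crawler user agent are hypothetical, and it says nothing about how Scrapling implements its compliance option.

```python
from urllib.robotparser import RobotFileParser

# Made-up rules for illustration; a real crawler would download the site's file.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 5
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def allowed(url, user_agent="my-crawler"):
    """True if the parsed rules permit `user_agent` to fetch `url`."""
    return parser.can_fetch(user_agent, url)


print(allowed("https://example.com/products"))   # True
print(allowed("https://example.com/private/x"))  # False
print(parser.crawl_delay("my-crawler"))          # 5
```

A check like this covers robots.txt only. Terms of service and local law are separate questions that no parser answers for you.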

Scrapling versus Scrapy and crawl4ai

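|          | Scrapling                      | Scrapy                 | crawl4ai             |
|----------|--------------------------------|------------------------|----------------------|
| Stars    | 62,607                         | 62,180                 | 68,182               |
| Focus    | adaptive selectors and stealth | mature crawl framework | LLM-ready extraction |
| License  | BSD-3-Clause                   | BSD-3-Clause           | Apache-2.0           |
| Anti-bot | built-in stealth fetchers      | bring your own         | bring your own       |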

Counts are from GitHub as of June 2026. Scrapy is the long-established crawling framework Scrapling’s spider API echoes, but Scrapy leaves anti-bot handling to you. crawl4ai targets clean Markdown and structured output for LLM pipelines rather than resilient selection. Scrapling’s distinct bets are the adaptive parser and stealth fetchers shipped in the box.

The Markdown-and-LLM extraction angle that crawl4ai chases overlaps with MarkItDown.

FAQ

What makes Scrapling “adaptive”? Elements saved with auto_save=True can be relocated with adaptive=True after a site changes its markup, so selectors survive redesigns instead of silently breaking.

Do I need a browser to use it? Only for the browser-based fetchers. Install scrapling[fetchers] and run scrapling install to download the browsers; the base install covers parsing on its own.

Can it bypass Cloudflare? StealthyFetcher is built to pass Cloudflare Turnstile and similar challenges. Whether you should is a terms-of-service question for the target site.

Is there an agent integration? Yes, it ships an MCP server and CLI. Watch for stdout noise interfering with the MCP stream, a documented issue.