kreuzcrawl, an open-source Rust crawling engine with 11 language bindings

kreuzcrawl is a high-performance web crawling engine. It is designed to extract structured data reliably and runs natively across multiple languages without imposing a specific runtime. Repository: https://github.com/kreuzberg-dev/kreuzcrawl

The MCP server is integrated from the start, enabling web-crawling AI agents as a primary use case. Streaming crawl events allow real-time progress tracking. Batch operations handle hundreds of URLs concurrently and tolerate partial failures. Browser rendering supports JavaScript-heavy SPAs and includes WAF detection.
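To make "tolerate partial failures" concrete, here is a minimal sketch of the general pattern in plain Python asyncio. It is not kreuzcrawl's API: `fetch` and `crawl_batch` are hypothetical names invented for illustration, and `fetch` simulates a request instead of touching the network. The idea is to cap concurrency and collect errors per URL, so one bad URL does not abort the rest of the batch.

```python
import asyncio


# Hypothetical stand-in for a real page fetch; NOT part of kreuzcrawl's API.
async def fetch(url: str) -> str:
    await asyncio.sleep(0)  # yield to the event loop, as real I/O would
    if "bad" in url:
        raise ValueError(f"cannot fetch {url}")
    return f"<html>{url}</html>"


async def crawl_batch(urls, max_concurrency=50):
    """Fetch all URLs concurrently and return (successes, failures) dicts."""
    sem = asyncio.Semaphore(max_concurrency)  # caps in-flight fetches

    async def guarded(url):
        async with sem:
            return await fetch(url)

    # return_exceptions=True returns errors as values instead of raising,
    # so a single failure does not cancel the remaining fetches.
    results = await asyncio.gather(
        *(guarded(u) for u in urls), return_exceptions=True
    )

    ok, failed = {}, {}
    for url, result in zip(urls, results):
        if isinstance(result, Exception):
            failed[url] = result
        else:
            ok[url] = result
    return ok, failed


if __name__ == "__main__":
    urls = [
        "https://example.com/a",
        "https://example.com/bad",
        "https://example.com/b",
    ]
    ok, failed = asyncio.run(crawl_batch(urls, max_concurrency=2))
    print(f"{len(ok)} succeeded, {len(failed)} failed")
```

Returning successes and failures as separate maps lets the caller retry only the failed URLs. How kreuzcrawl itself reports per-URL errors is documented in the repository.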

Supported language interfaces are Rust, Python, TypeScript/Node.js, Go, Ruby, Java, C#, PHP, Elixir, WASM, and C FFI, and each binding connects directly to the core engine.
Kreuzcrawl is part of the Kreuzberg org: https://kreuzberg.dev/

Feedback and contributions are welcome :)

submitted by /u/Eastern-Surround7763