WaterCrawl

WaterCrawl transforms web content into LLM-ready data with Python, Django, Scrapy, and Celery, offering advanced crawling and a REST API.

Introduction

WaterCrawl

🕷️ WaterCrawl is a powerful web application that uses Python, Django, Scrapy, and Celery to crawl web pages and extract relevant data.

✨ Features
  • 🕸️ Advanced Web Crawling & Scraping - Crawl websites with highly customizable options for depth, speed, and targeting specific content
  • 🔍 Powerful Search Engine - Find relevant content across the web with multiple search depths (basic, advanced, ultimate)
  • 🌐 Multi-language Support - Search and crawl content in different languages with country-specific targeting
  • ⚑ Asynchronous Processing - Monitor real-time progress of crawls and searches via Server-Sent Events (SSE)
  • πŸ”„ REST API with OpenAPI - Comprehensive API with detailed documentation and client libraries
  • πŸ”Œ Rich Ecosystem - Integrations with Dify, N8N, and other AI/automation platforms
  • 🏠 Self-hosted & Open Source - Full control over your data with easy deployment options
  • 📊 Advanced Results Handling - Download and process search results with customizable parameters

Check our API Overview to learn more about these features.
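
To illustrate the real-time monitoring feature listed above, here is a minimal sketch of how a client might parse a Server-Sent Events stream, following the standard SSE text format (fields such as `event:` and `data:`, with a blank line ending each event). It is not WaterCrawl's own client code: the helper name `iter_sse_events`, the event name `state`, and the payload fields `status` and `pages` are all hypothetical, chosen only for the example. Consult the API Overview for the actual endpoints and event shapes.

```python
import json
from typing import Iterable, Iterator


def iter_sse_events(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one dict per SSE event from an iterable of decoded text lines."""
    event_type = "message"  # default event type in the SSE format
    data_parts = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line ends the current event; events without data are skipped.
            if data_parts:
                yield {"event": event_type, "data": "\n".join(data_parts)}
            event_type, data_parts = "message", []
            continue
        if line.startswith(":"):
            continue  # comment line, often used as a keep-alive
        field, _, value = line.partition(":")
        if value.startswith(" "):
            value = value[1:]  # a single leading space after the colon is dropped
        if field == "event":
            event_type = value
        elif field == "data":
            data_parts.append(value)


# Hypothetical stream: the event name and JSON fields are invented for this sketch.
sample = [
    ": keep-alive\n",
    "event: state\n",
    'data: {"status": "running", "pages": 3}\n',
    "\n",
    "event: state\n",
    'data: {"status": "finished", "pages": 10}\n',
    "\n",
]

events = list(iter_sse_events(sample))
for ev in events:
    payload = json.loads(ev["data"])
    print(ev["event"], payload["status"], payload["pages"])
```

In practice the lines would come from a streaming HTTP response rather than a list; the parser only needs an iterable of text lines, so it can be reused with whichever HTTP library is in use.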

🛠️ Client SDKs
  • ✅ Python Client - Full-featured SDK with support for all API endpoints
  • ✅ Node.js Client - Complete JavaScript/TypeScript integration
  • ✅ Go Client - Full-featured SDK with support for all API endpoints
  • ✅ PHP Client - Full-featured SDK with support for all API endpoints
  • 🔜 Rust Client - Coming soon

🔌 Integrations