A configurable, thread-safe web crawler that provides a minimal interface for crawling and downloading web pages.
- Clean minimal API.
- Configurable: MaxDepth, MaxBodySize, Rate Limit, Parallelism, User Agent & Proxy rotation.
- Memory-efficient, thread-safe.
- Provides built-in interfaces: Fetcher, Store, Queue & a Logger.
- Add support for robots.txt.
- Add test cases.
- Implement Fetch using Chromedp.
- Add more examples.
- Add documentation.
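
As a rough illustration of how a pluggable Fetcher might combine with the MaxDepth and Parallelism options listed above, here is a minimal, self-contained sketch. It is not this library's actual API: the `Fetcher` method signature, the `Config` field names, the `Crawl` function, and the in-memory `mapFetcher` are all hypothetical names chosen for the example.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// Fetcher is a hypothetical interface; the library's real Fetcher may differ.
type Fetcher interface {
	// Fetch returns the links found on the page at url.
	Fetch(url string) ([]string, error)
}

// Config mirrors two of the options named in the feature list.
// The field names are assumptions made for this sketch.
type Config struct {
	MaxDepth    int // the start page is depth 0
	Parallelism int // maximum number of concurrent fetches
}

// mapFetcher is an in-memory stand-in for the network: URL -> outgoing links.
type mapFetcher map[string][]string

func (m mapFetcher) Fetch(url string) ([]string, error) {
	links, ok := m[url]
	if !ok {
		return nil, fmt.Errorf("not found: %s", url)
	}
	return links, nil
}

// Crawl fetches pages level by level up to cfg.MaxDepth and returns the
// sorted list of URLs that were fetched successfully.
func Crawl(start string, f Fetcher, cfg Config) []string {
	if cfg.Parallelism < 1 {
		cfg.Parallelism = 1 // an unbuffered semaphore would block forever
	}
	sem := make(chan struct{}, cfg.Parallelism) // bounds concurrent fetches
	seen := map[string]bool{start: true}        // URLs already scheduled
	frontier := []string{start}
	var fetched []string
	var mu sync.Mutex // guards seen, fetched and next

	for depth := 0; depth <= cfg.MaxDepth && len(frontier) > 0; depth++ {
		var next []string
		var wg sync.WaitGroup
		for _, u := range frontier {
			wg.Add(1)
			sem <- struct{}{} // wait for a free slot
			go func(u string) {
				defer wg.Done()
				defer func() { <-sem }() // release the slot
				links, err := f.Fetch(u)
				if err != nil {
					return // failed pages are skipped in this sketch
				}
				mu.Lock()
				defer mu.Unlock()
				fetched = append(fetched, u)
				for _, l := range links {
					if !seen[l] {
						seen[l] = true
						next = append(next, l)
					}
				}
			}(u)
		}
		wg.Wait() // finish the whole level before going deeper
		frontier = next
	}
	sort.Strings(fetched)
	return fetched
}

func main() {
	site := mapFetcher{
		"a": {"b", "c"},
		"b": {"d"},
		"c": {"d", "a"},
		"d": {"e"},
		"e": nil,
	}
	pages := Crawl("a", site, Config{MaxDepth: 2, Parallelism: 2})
	fmt.Println(pages) // prints "[a b c d]"; "e" sits at depth 3 and is skipped
}
```

Keeping the fetcher behind an interface means a test double such as `mapFetcher` can stand in for real HTTP requests, and a headless-browser implementation could be swapped in later without touching the crawl loop.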
Bugs or suggestions? Please visit the issue tracker.