Snap Kit is a powerful, Puppeteer-based service designed for seamless web automation. It enables you to effortlessly capture high-fidelity web page screenshots and efficiently scrape web content for precise data extraction.

blocklet/snap-kit

📸 Snap Kit

Enterprise-grade web automation platform powered by Puppeteer

Features • Quick Start • Architecture • API Reference • Examples


🚀 Why Snap Kit?

Snap Kit is a production-ready web automation platform for web scraping, screenshot generation, and SEO pre-rendering. Built on the Blocklet ecosystem, it pairs enterprise-grade reliability with developer-friendly APIs.

💡 Key Benefits

  • Zero Configuration: Deploy instantly with Docker or Blocklet Server
  • Production Scale: Handle thousands of concurrent requests with built-in queuing
  • SEO Powerhouse: Pre-render SPAs for perfect search engine indexing
  • Developer Experience: Modern TypeScript APIs with comprehensive documentation
  • Cost Effective: Self-hosted solution with no per-request fees

✨ Features

🎯 Core Capabilities

  • High-Fidelity Screenshots: Capture pixel-perfect web page snapshots
  • Smart Content Extraction: Extract structured data with advanced parsing
  • Batch Processing: Process multiple URLs efficiently with queue management
  • Sitemap Crawling: Automatically discover and crawl entire websites
  • SEO Pre-rendering: Generate search-engine-friendly HTML for SPAs
  • Custom Headers & Cookies: Full control over request customization

🔧 Technical Highlights

  • Puppeteer Integration: Latest Chrome automation capabilities
  • SQLite Database: Efficient data storage with Sequelize ORM
  • React 19.1 UI: Modern, responsive web interface
  • Express 4.21 API: RESTful endpoints with TypeScript
  • Blocklet Platform: One-click deployment and scaling

📦 What's Included

This monorepo contains three production-ready modules:

🏗️ Snap Kit Blocklet

Location: blocklets/snap-kit

The main application featuring:

  • React Frontend: Modern UI for managing crawling tasks
  • Express API: RESTful endpoints for automation
  • DID Authentication: Secure access control
  • Real-time Dashboard: Monitor crawling progress

🕷️ Crawler Engine

Location: packages/crawler

Core automation engine with:

  • Puppeteer Integration: Latest Chrome automation
  • Database Management: SQLite with migrations
  • Queue System: Efficient batch processing
  • Scheduled Tasks: Automated crawling workflows

🌐 SEO Middleware

Location: packages/middleware

Express middleware for:

  • Pre-rendering: Generate static HTML for SPAs
  • Cache Management: Intelligent caching strategies
  • Search Engine Optimization: Perfect SEO for dynamic content
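
Pre-rendering middleware commonly serves static HTML only to search engine crawlers and lets ordinary browsers load the SPA as usual. The sketch below illustrates that decision in TypeScript. It is not the API of packages/middleware: the `IncomingRequest` type, the `shouldPrerender` function, and the crawler pattern are all assumptions made for illustration.

```typescript
// Hypothetical, framework-agnostic view of a request (not an Express type).
interface IncomingRequest {
  method: string;
  path: string;
  userAgent?: string;
}

// Illustrative pattern only; a real deployment needs a maintained crawler list.
const BOT_PATTERN = /googlebot|bingbot|duckduckbot|baiduspider|yandex/i;

// Decide whether a request should receive pre-rendered HTML.
function shouldPrerender(req: IncomingRequest): boolean {
  // Only page loads are candidates; form posts and API calls are not.
  if (req.method !== "GET") return false;

  // Skip static assets: a last path segment with an extension other than .html.
  const lastSegment = req.path.split("/").pop() ?? "";
  const dot = lastSegment.lastIndexOf(".");
  const ext = dot >= 0 ? lastSegment.slice(dot + 1).toLowerCase() : "";
  if (ext !== "" && ext !== "html") return false;

  // Finally, check whether the user agent looks like a known crawler.
  return BOT_PATTERN.test(req.userAgent ?? "");
}

console.log(
  shouldPrerender({
    method: "GET",
    path: "/products/42",
    userAgent: "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
  }),
);
```

In an Express application, a check like this would typically run early in the middleware chain, so crawler requests can be answered from the pre-render cache before the SPA fallback route is reached.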

🚀 Quick Start

Prerequisites

  • Node.js 18+
  • pnpm (recommended) or npm
  • Docker (optional)

Installation

# Clone the repository
git clone https://github.com/blocklet/snap-kit.git
cd snap-kit

# Install dependencies
pnpm install

# Start development environment
pnpm dev

Docker Deployment

# Run the published image with Docker (serves on port 3000)
docker run -p 3000:3000 arcblock/snap-kit

Blocklet Server

# Deploy to Blocklet Server (run from blocklets/snap-kit)
npm run deploy

🏗️ Architecture

┌─────────────────────┐    ┌─────────────────────┐    ┌─────────────────────┐
│                     │    │                     │    │                     │
│   React Frontend    │───▶│   Express API       │───▶│   Crawler Engine    │
│                     │    │                     │    │                     │
│  • Modern UI        │    │  • RESTful API      │    │  • Puppeteer        │
│  • Real-time        │    │  • Authentication   │    │  • Queue System     │
│  • Dashboard        │    │  • Rate Limiting    │    │  • SQLite Storage   │
│                     │    │                     │    │                     │
└─────────────────────┘    └─────────────────────┘    └─────────────────────┘

Technology Stack

  • Frontend: React 19.1, TypeScript, Vite 7.0
  • Backend: Express 4.21, TypeScript, DID Auth
  • Database: SQLite with Sequelize ORM
  • Automation: Puppeteer, @blocklet/puppeteer
  • Deployment: Blocklet Platform, Docker

📚 API Reference

Screenshot Generation

POST /api/screenshot
{
  "url": "https://example.com",
  "options": {
    "width": 1920,
    "height": 1080,
    "fullPage": true
  }
}
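
A minimal TypeScript sketch of calling this endpoint follows. It assumes a Snap Kit instance reachable at a base URL you supply and an endpoint that accepts a JSON body. The helper names `buildScreenshotRequest` and `requestScreenshot` are illustrative and not part of Snap Kit. Authentication is omitted, and since the response format is not documented here, the raw `Response` is returned.

```typescript
// Hypothetical request shape mirroring the JSON body shown above.
interface ScreenshotRequest {
  url: string;
  options: { width: number; height: number; fullPage: boolean };
}

// Build the payload, falling back to the values used in the example above.
function buildScreenshotRequest(
  url: string,
  options: Partial<ScreenshotRequest["options"]> = {},
): ScreenshotRequest {
  // new URL() throws a TypeError on malformed input, so bad URLs fail early.
  const normalized = new URL(url).toString();
  return {
    url: normalized,
    options: { width: 1920, height: 1080, fullPage: true, ...options },
  };
}

// Assumption: baseUrl points at a running Snap Kit instance.
async function requestScreenshot(baseUrl: string, pageUrl: string): Promise<Response> {
  return fetch(new URL("/api/screenshot", baseUrl).toString(), {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(buildScreenshotRequest(pageUrl)),
  });
}
```

With the Docker port mapping from the Quick Start, a call might look like `await requestScreenshot("http://localhost:3000", "https://example.com")`. Check `response.ok` before reading the body.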

Content Extraction

POST /api/extract
{
  "url": "https://example.com",
  "selectors": {
    "title": "h1",
    "description": "meta[name='description']"
  }
}
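
The `selectors` object maps each output field name to a CSS selector. The TypeScript sketch below builds that payload and rejects empty entries before anything is sent. `ExtractRequest` and `buildExtractRequest` are hypothetical names, and the shape of the response is not covered because it is not documented here.

```typescript
// Hypothetical payload type for POST /api/extract, mirroring the JSON above.
interface ExtractRequest {
  url: string;
  selectors: Record<string, string>;
}

// Validate field names and selectors, then return the request payload.
function buildExtractRequest(url: string, selectors: Record<string, string>): ExtractRequest {
  const entries = Object.entries(selectors);
  if (entries.length === 0) {
    throw new Error("at least one selector is required");
  }
  for (const [field, selector] of entries) {
    if (field.trim() === "" || selector.trim() === "") {
      throw new Error(`empty field name or selector for "${field}"`);
    }
  }
  // new URL() throws on malformed URLs, so they never reach the server.
  return { url: new URL(url).toString(), selectors };
}

const payload = buildExtractRequest("https://example.com", {
  title: "h1",
  description: "meta[name='description']",
});
console.log(JSON.stringify(payload, null, 2));
```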

Batch Processing

POST /api/crawl/batch
{
  "urls": ["https://site1.com", "https://site2.com"],
  "options": {
    "priority": "high",
    "schedule": "immediate"
  }
}
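
For long URL lists, it can help to de-duplicate and split the list into several smaller batch requests on the client side. The TypeScript sketch below shows one way to do that. The `buildBatchRequests` helper and the default chunk size of 50 are assumptions, not Snap Kit limits, and only the `priority` and `schedule` values shown above are used.

```typescript
// Hypothetical payload type for POST /api/crawl/batch, mirroring the JSON above.
interface BatchRequest {
  urls: string[];
  options: { priority: string; schedule: string };
}

// Normalize, de-duplicate, and split URLs into payloads of at most chunkSize.
function buildBatchRequests(urls: string[], chunkSize = 50): BatchRequest[] {
  if (!Number.isInteger(chunkSize) || chunkSize < 1) {
    throw new RangeError("chunkSize must be a positive integer");
  }
  // new URL() throws on malformed input; Set keeps first-seen order.
  const unique = Array.from(new Set(urls.map((u) => new URL(u).toString())));

  const requests: BatchRequest[] = [];
  for (let i = 0; i < unique.length; i += chunkSize) {
    requests.push({
      urls: unique.slice(i, i + chunkSize),
      // Values taken from the example above; other values are not documented here.
      options: { priority: "high", schedule: "immediate" },
    });
  }
  return requests;
}

const batches = buildBatchRequests(
  ["https://site1.com", "https://site2.com", "https://site1.com/"],
  1,
);
console.log(batches.length); // 2: the duplicate of site1.com is dropped
```

Each resulting object can then be sent as the JSON body of its own `POST /api/crawl/batch` request.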

🎯 Use Cases

📊 Website Monitoring

Monitor competitor websites and track changes automatically.

🔍 SEO Optimization

Pre-render SPAs for perfect search engine indexing.

📈 Data Analytics

Extract structured data from websites for business intelligence.

🖼️ Visual Testing

Generate screenshots for visual regression testing.

📱 Social Media

Generate social media preview images automatically.

🛠️ Development

Commands

# Development
pnpm dev                    # Start all services
pnpm build:packages        # Build all packages
pnpm lint                  # Lint all packages
pnpm lint:fix              # Fix lint issues

# Snap Kit Blocklet
cd blocklets/snap-kit
npm run dev                # Development server
npm run bundle             # Production build
npm run deploy             # Deploy to Blocklet Server

Project Structure

snap-kit/
├── blocklets/snap-kit/     # Main Blocklet application
│   ├── src/               # React frontend
│   ├── api/               # Express API
│   └── public/            # Static assets
├── packages/
│   ├── crawler/           # Core crawler engine
│   └── middleware/        # SEO middleware
└── scripts/               # Build and utility scripts

🔒 Security

  • DID Authentication: Secure decentralized identity
  • Rate Limiting: Prevent abuse and ensure fair usage
  • Input Validation: Comprehensive request sanitization
  • CORS Configuration: Secure cross-origin requests

📖 Documentation

🤝 Contributing

We welcome contributions! Please see our Contributing Guide for details.

📄 License

MIT License - see LICENSE for details.

🌟 Support


Built with ❤️ by the ArcBlock team
ArcBlock • Blocklet • GitHub
