A comprehensive multi-chain ERC20 token indexing and analytics system that extracts blockchain data, processes it into metrics, and provides REST APIs for querying the results.
┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│   Blockchain    │    │   Google Cloud   │    │   PostgreSQL    │
│    RPC Nodes    │────│  Storage (GCS)   │────│    Database     │
└─────────────────┘    └──────────────────┘    └─────────────────┘
         │                       │                      │
         ▼                       ▼                      ▼
┌─────────────────┐    ┌──────────────────┐    ┌─────────────────┐
│   multichain-   │    │     BigQuery     │    │ erc20-metrics-  │
│  orchestrator   │───▶│  Data Warehouse  │◄───│      cron       │
└─────────────────┘    └──────────────────┘    └─────────────────┘
         │                       │                      │
         ▼                       │                      ▼
┌─────────────────┐              │             ┌─────────────────┐
│ erc20-processor │              │             │ erc20-query-api │
└─────────────────┘              │             └─────────────────┘
                                 │                      │
                                 ▼                      ▼
                             ┌─────────────────────────────┐
                             │          REST API           │
                             │         Consumers           │
                             └─────────────────────────────┘
multichain-orchestrator
Purpose: Extracts raw blockchain data (blocks and ERC20 transfers) using Cryo
- Configurable chain support (Ethereum, Optimism, Arbitrum, Base)
- Parallel chunk processing
- Error handling and retry logic
- Uploads to Google Cloud Storage
erc20-processor
Purpose: Processes raw extracted data and loads it into PostgreSQL
- Transforms raw blockchain data
- Database schema management
- Batch processing for performance
erc20-metrics-cron
Purpose: Scheduled collection of analytics and metrics from BigQuery
- Daily, hourly, and holder statistics
- Automatic synchronization from BigQuery to PostgreSQL
- Reconciliation and cleanup jobs
- Support for multiple authentication methods
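The BigQuery-to-PostgreSQL sync is tracked in a sync_status table (see Monitored Tables below). As a rough illustration of how that state can be inspected programmatically, here is a minimal, hypothetical sketch using sqlx and tokio; only the columns referenced elsewhere in this README (metric_type, token_address, updated_at) are assumed:

// Hypothetical sketch, not part of the service: list the most recently synced
// metric types per token. Assumes sqlx with the "postgres" and "chrono" features.
use sqlx::postgres::PgPoolOptions;

#[derive(sqlx::FromRow, Debug)]
struct SyncStatus {
    metric_type: String,
    token_address: String,
    updated_at: chrono::DateTime<chrono::Utc>,
}

#[tokio::main]
async fn main() -> Result<(), sqlx::Error> {
    let pool = PgPoolOptions::new()
        .max_connections(2)
        .connect("postgresql://erc20_user:password@localhost/erc20_optimism")
        .await?;

    let rows: Vec<SyncStatus> = sqlx::query_as(
        "SELECT metric_type, token_address, updated_at FROM sync_status ORDER BY updated_at DESC LIMIT 5",
    )
    .fetch_all(&pool)
    .await?;

    for row in rows {
        println!("{} / {}: last synced {}", row.metric_type, row.token_address, row.updated_at);
    }
    Ok(())
}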
erc20-query-api
Purpose: REST API for querying processed metrics and analytics
- Token metrics endpoints
- Historical data queries
- Whale transfer tracking
- Performance analytics
# Install Rust
curl --proto '=https' --tlsv1.2 -sSf https://sh.rustup.rs | sh
# Install Cryo
cargo install cryo_cli
# Install Google Cloud SDK
curl https://sdk.cloud.google.com | bash
# Install PostgreSQL (version 14+)
# On Ubuntu/Debian:
sudo apt install postgresql postgresql-contrib
# On macOS:
brew install postgresql
- Google Cloud Authentication:
# Authenticate with Google Cloud
gcloud auth login
gcloud auth application-default login
# Set your project
gcloud config set project YOUR_PROJECT_ID
# For service accounts (production):
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/service-account-key.json"
- PostgreSQL Setup:
-- Create database and user
CREATE DATABASE erc20_optimism;
CREATE USER erc20_user WITH PASSWORD 'your_password';
GRANT ALL PRIVILEGES ON DATABASE erc20_optimism TO erc20_user;
- BigQuery Setup:
# Create dataset
bq mk --dataset YOUR_PROJECT_ID:erc20_multichain_india
# Create tables (run once)
cd multichain-orchestrator
./scripts/create-multichain-bq.sh
cd multichain-orchestrator
# Configure extraction range
# Edit configs/optimism-only.toml
[chains.optimism]
chain_id = 10
start_block = 140313367 # Start from where you left off
end_block = 140413367 # 100k new blocks
chunk_size = 100000
# Run extraction
cargo run --release --bin multichain-orchestrator configs/optimism-only.toml
# Upload extracted parquet files to GCS
./scripts/upload-multichain-gcs.sh optimism
# Verify upload
gsutil ls gs://your-bucket/optimism/erc20_transfers/ | tail -5
# Use the new incremental loading script
./scripts/load-multichain-incremental.sh optimism
# For dry-run to preview what will be loaded:
./scripts/load-multichain-incremental.sh --dry-run optimism
# Force full reload (careful!):
./scripts/load-multichain-incremental.sh --force-replace optimism
The erc20-metrics-cron service runs automatically on schedule:
cd erc20-metrics-cron
# Copy and edit configuration
cp config.toml.example config.toml
# Edit database connection, BigQuery project, etc.
# Run the service
cargo run --release --bin erc20-metrics-cron
# For immediate sync (don't wait for schedule):
cargo run --release --bin erc20-metrics-cron -- --reconcile
For adding new blocks to an existing chain:
# 1. Configure new block range
vim multichain-orchestrator/configs/optimism-only.toml
# 2. Extract data
cd multichain-orchestrator
cargo run --release --bin multichain-orchestrator configs/optimism-only.toml
# 3. Upload to GCS
./scripts/upload-multichain-gcs.sh optimism
# 4. Load incrementally to BigQuery
./scripts/load-multichain-incremental.sh optimism
# 5. Verify metrics sync (automatic via cron)
# Check logs: tail -f /path/to/metrics-cron.log
multichain-orchestrator
Configuration (configs/optimism-only.toml):
[chains.optimism]
chain_id = 10
start_block = 140313367
end_block = 140413367
chunk_size = 100000
reorg_buffer = 100
rpcs = [
{ url = "http://your-rpc-url:8545", max_requests_per_second = 1000 }
]
[orchestration]
max_parallel_chains = 1
retry_failed_chunks_interval_minutes = 15
health_check_interval_seconds = 60
checkpoint_interval_minutes = 1
max_retry_attempts = 3
Key Scripts:
- upload-multichain-gcs.sh - Upload parquet files to GCS
- load-multichain-incremental.sh - Incremental BigQuery loading
- load-multichain-data.sh - Full BigQuery reload (use with caution)
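To make the chunking behaviour concrete, here is a minimal, hypothetical sketch (not the orchestrator's actual code) of how the [chains.optimism] range above can be split into chunk_size work units, with reorg_buffer trimming the unsafe tip, and fanned out in parallel:

// Hypothetical sketch, not the orchestrator's actual code.
// Splits a configured block range into chunk_size work units; reorg_buffer
// keeps the most recent blocks out of the extraction to avoid reorged data.
fn chunk_ranges(start_block: u64, end_block: u64, chunk_size: u64, reorg_buffer: u64) -> Vec<(u64, u64)> {
    let safe_end = end_block.saturating_sub(reorg_buffer);
    let mut ranges = Vec::new();
    let mut chunk_start = start_block;
    while chunk_start <= safe_end {
        let chunk_end = (chunk_start + chunk_size - 1).min(safe_end);
        ranges.push((chunk_start, chunk_end));
        chunk_start = chunk_end + 1;
    }
    ranges
}

// Stand-in for the real extraction step (the orchestrator drives Cryo against the RPCs).
fn extract_chunk(range: (u64, u64)) {
    println!("extracting blocks {}..={}", range.0, range.1);
}

fn main() {
    // Values mirror the [chains.optimism] example above.
    let ranges = chunk_ranges(140_313_367, 140_413_367, 100_000, 100);
    // Fan the chunks out; the real service also checkpoints and retries failed chunks.
    std::thread::scope(|s| {
        for &range in &ranges {
            s.spawn(move || extract_chunk(range));
        }
    });
}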
Usage Examples:
# Extract 1 million blocks in 100k chunks
cargo run --release --bin multichain-orchestrator configs/optimism-only.toml
# Monitor progress
tail -f extraction_state/log_*.txt
# Check completed chunks
cat extraction_state/completed_blocks.txt
erc20-processor
Purpose: Transform and load raw blockchain data into PostgreSQL
Database Schema:
- optimism_blocks - Block information
- optimism_erc20_transfers - ERC20 transfer events
- Token metadata and analytics tables
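The loading step writes rows into these tables in batches rather than committing row by row. A minimal, hypothetical sketch with sqlx is shown below; the column names, connection string, and Transfer struct are assumptions for illustration, not the processor's actual schema or code:

// Hypothetical sketch, not the processor's actual code. Column names and the
// connection string are assumptions; adjust to the real optimism_erc20_transfers schema.
// Assumes sqlx with the "runtime-tokio" and "postgres" features, plus tokio.
use sqlx::postgres::PgPoolOptions;

struct Transfer {
    block_number: i64,
    transaction_hash: String,
    erc20: String,
    from_address: String,
    to_address: String,
    value: String,
}

#[tokio::main]
async fn main() -> Result<(), sqlx::Error> {
    let pool = PgPoolOptions::new()
        .max_connections(5)
        .connect("postgresql://erc20_user:password@localhost/erc20_optimism")
        .await?;

    // In the real processor these rows come from the parquet files extracted earlier.
    let batch: Vec<Transfer> = Vec::new();

    // One transaction per batch instead of a round trip and commit per row.
    let mut tx = pool.begin().await?;
    for t in &batch {
        sqlx::query(
            "INSERT INTO optimism_erc20_transfers
             (block_number, transaction_hash, erc20, from_address, to_address, value)
             VALUES ($1, $2, $3, $4, $5, $6)",
        )
        .bind(t.block_number)
        .bind(&t.transaction_hash)
        .bind(&t.erc20)
        .bind(&t.from_address)
        .bind(&t.to_address)
        .bind(&t.value)
        .execute(&mut *tx)
        .await?;
    }
    tx.commit().await?;
    Ok(())
}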
Usage:
cd erc20-processor
# Process parquet files into PostgreSQL
cargo run --release
# Run with custom configuration
DATABASE_URL="postgresql://user:pass@localhost/db" cargo run --release
erc20-metrics-cron
Purpose: Collect and sync metrics from BigQuery to PostgreSQL
Configuration (config.toml):
[bigquery]
project_id = "your-project-id"
dataset_id = "erc20_multichain_india"
[postgres]
url = "postgresql://erc20_user:password@localhost/erc20_optimism"
[schedules]
daily_metrics = "0 30 */6 * * *" # Every 6 hours
holder_stats = "0 15 */12 * * *" # Every 12 hours
hourly_patterns = "0 45 */4 * * *" # Every 4 hours
Running Modes:
- Scheduled Service (Production):
# Start the cron service
cargo run --release --bin erc20-metrics-cron
# The service will run these jobs on schedule:
# - Daily metrics: 00:30, 06:30, 12:30, 18:30
# - Holder stats: 00:15, 12:15
# - Hourly patterns: 00:45, 04:45, 08:45, 12:45, 16:45, 20:45
- Manual Reconciliation:
# Run cleanup only (doesn't sync new data)
cargo run --release --bin erc20-metrics-cron -- --reconcile
- Force Initial Sync:
# Clear sync status to force re-sync
psql -d erc20_optimism -c "DELETE FROM sync_status WHERE metric_type = 'daily_metrics';"
# Restart service - it will run initial collection
cargo run --release --bin erc20-metrics-cron
Authentication Setup:
# For service account (recommended for production)
export GOOGLE_APPLICATION_CREDENTIALS="/path/to/service-account-key.json"
# For user account (development)
gcloud auth application-default login
export GOOGLE_APPLICATION_CREDENTIALS="$HOME/.config/gcloud/application_default_credentials.json"
Monitored Tables:
- token_daily_metrics - Daily aggregated statistics
- whale_transfers - Large value transfers (>$10k)
- token_rankings - Daily token rankings by volume/activity
- sync_status - Tracks synchronization state
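The schedule strings in config.toml above are 6-field cron expressions (seconds field first). A small standalone sketch, assuming the cron and chrono crates (not part of the service itself), prints the next few firing times so you can confirm they match the times listed under Running Modes:

// Standalone sanity check, assuming the `cron` and `chrono` crates.
use chrono::Utc;
use cron::Schedule;
use std::str::FromStr;

fn main() {
    // Expressions copied from the [schedules] section of config.toml above.
    let jobs = [
        ("daily_metrics", "0 30 */6 * * *"),
        ("holder_stats", "0 15 */12 * * *"),
        ("hourly_patterns", "0 45 */4 * * *"),
    ];
    for (name, expr) in jobs {
        let schedule = Schedule::from_str(expr).expect("valid cron expression");
        let next: Vec<_> = schedule.upcoming(Utc).take(3).collect();
        println!("{name}: next runs at {next:?}");
    }
}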
erc20-query-api
Purpose: REST API for querying processed ERC20 metrics
Starting the API:
cd erc20-query-api
# Start on default port 8080
cargo run --release
# Custom port and database
DATABASE_URL="postgresql://user:pass@localhost/db" PORT=3000 cargo run --release
API Endpoints:
# Get daily metrics for a specific token
curl "http://localhost:8080/api/v1/optimism/tokens/0x4200000000000000000000000000000000000006/daily-metrics?days=7"
# Response format:
{
  "data": [
    {
      "date": "2025-08-29",
      "daily_active_addresses": 1234,
      "daily_transactions": 5678,
      "daily_transfers": 9012,
      "daily_volume": "123.456789",
      "daily_minted": "10.0",
      "daily_burned": "5.0"
    }
  ],
  "count": 7
}
# Get large transfers for a token
curl "http://localhost:8080/api/v1/optimism/tokens/0x4200000000000000000000000000000000000006/whale-transfers?limit=10&min_usd=50000"
# Response format:
{
  "data": [
    {
      "transfer_date": "2025-08-29",
      "transaction_hash": "0xabc123...",
      "from_address": "0xdef456...",
      "to_address": "0x789abc...",
      "amount": "100.0",
      "amount_usd": "150000.50"
    }
  ],
  "count": 10
}
# Get daily token rankings
curl "http://localhost:8080/api/v1/optimism/rankings/daily?limit=20"
# Response format:
{
  "data": [
    {
      "ranking_date": "2025-08-29",
      "token_address": "0x4200000000000000000000000000000000000006",
      "rank_by_volume": 1,
      "rank_by_transactions": 1,
      "volume_24h": "1000000.50",
      "transactions_24h": 15000
    }
  ],
  "count": 20
}
# Get hourly activity patterns
curl "http://localhost:8080/api/v1/optimism/tokens/0x4200000000000000000000000000000000000006/hourly-patterns?hours=24"
# Response format:
{
  "data": [
    {
      "pattern_hour": "2025-08-29T10:00:00Z",
      "hourly_transactions": 156,
      "hourly_volume": "12345.67",
      "hourly_active_addresses": 89
    }
  ],
  "count": 24
}
Query Parameters:
- days - Number of days of historical data (default: 30, max: 365)
- hours - Number of hours of historical data (default: 24, max: 168)
- limit - Maximum number of results (default: 100, max: 1000)
- offset - Results offset for pagination (default: 0)
- min_usd - Minimum USD value for whale transfers (default: 10000)
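For consumers of the API, the endpoints above can be called from any HTTP client. A minimal, hypothetical Rust example using reqwest (with the blocking and json features) and serde, deserializing the daily-metrics response shape shown earlier:

// Hypothetical consumer sketch. Field names follow the daily-metrics response above;
// assumes reqwest = { features = ["blocking", "json"] } and serde with derive.
use serde::Deserialize;

#[derive(Deserialize, Debug)]
struct DailyMetric {
    date: String,
    daily_active_addresses: u64,
    daily_transactions: u64,
    daily_transfers: u64,
    daily_volume: String,
    daily_minted: String,
    daily_burned: String,
}

#[derive(Deserialize, Debug)]
struct DailyMetricsResponse {
    data: Vec<DailyMetric>,
    count: u64,
}

fn main() -> Result<(), Box<dyn std::error::Error>> {
    let token = "0x4200000000000000000000000000000000000006";
    let url = format!("http://localhost:8080/api/v1/optimism/tokens/{token}/daily-metrics?days=7");
    let resp: DailyMetricsResponse = reqwest::blocking::get(url)?.json()?;
    println!("{} days returned", resp.count);
    for m in resp.data {
        println!("{}: volume {}, {} transfers", m.date, m.daily_volume, m.daily_transfers);
    }
    Ok(())
}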
Health Check:
curl http://localhost:8080/health
# Response: {"status": "healthy"}
- Monitor Data Freshness:
# Check latest block in BigQuery
bq query --use_legacy_sql=false "SELECT MAX(block_number) FROM \`project.dataset.erc20_transfers_optimism\`"
# Check sync status
psql -d erc20_optimism -c "SELECT * FROM sync_status ORDER BY updated_at DESC LIMIT 5;"
- Monitor Service Health:
# Check if metrics cron is running
ps aux | grep erc20-metrics-cron
# Check API health
curl http://localhost:8080/health
# View recent logs
tail -f /path/to/logs/erc20-metrics-cron.log
Adding a new token:
- Update Configuration:
# Add to erc20-metrics-cron/config.toml
[[tokens.optimism]]
address = "0xNewTokenAddress"
symbol = "NEW"
name = "New Token"
decimals = 18
priority = 5
- Clear Sync Status (to force historical sync):
-- This will make the cron collect data from day 1 for the new token
DELETE FROM sync_status
WHERE token_address = '0xNewTokenAddress';
- Restart Service:
# Restart metrics cron to pick up new token
pkill erc20-metrics-cron
cargo run --release --bin erc20-metrics-cron
The recommended workflow for adding new blocks to an existing chain:
- Update Configuration:
# Edit configs/optimism-only.toml
# Set start_block to where you left off
# Set end_block to the new target
- Extract and Upload:
# Extract new data
cargo run --release --bin multichain-orchestrator configs/optimism-only.toml
# Upload to GCS
./scripts/upload-multichain-gcs.sh optimism
- Load Incrementally:
# Only loads new blocks, won't duplicate existing data
./scripts/load-multichain-incremental.sh optimism
- Verify:
# Check block range in BigQuery
bq query --use_legacy_sql=false "SELECT MIN(block_number), MAX(block_number), COUNT(*) FROM \`project.dataset.erc20_transfers_optimism\`"
Troubleshooting:
Authentication Issues:
# Check current auth
gcloud auth list
# Refresh application default credentials
gcloud auth application-default login
# For service account issues
gcloud iam service-accounts keys list --iam-account=your-service-account@project.iam.gserviceaccount.com
BigQuery Query Failures:
# Check permissions
bq ls --project_id=your-project-id
# Test simple query
bq query --use_legacy_sql=false "SELECT 1"
Database Connection Issues:
# Test PostgreSQL connection
psql -h localhost -U erc20_user -d erc20_optimism -c "SELECT 1;"
# Check running processes
sudo netstat -tlnp | grep :5432
Incremental Loading Issues:
# Debug what would be loaded
./scripts/load-multichain-incremental.sh --dry-run optimism
# Check for overlapping files
./scripts/load-multichain-incremental.sh --verbose optimism
Performance tuning:
- Parallel Processing:
# Increase parallel chains for multi-chain setups
max_parallel_chains = 4
# Optimize chunk size based on chain
chunk_size = 100000 # Good for Optimism
chunk_size = 1000 # Better for Ethereum mainnet
- RPC Optimization:
# Use multiple RPC endpoints
rpcs = [
{ url = "http://rpc1:8545", max_requests_per_second = 1000 },
{ url = "http://rpc2:8545", max_requests_per_second = 1000 }
]
- BigQuery Optimization:
# Use clustering for better query performance
bq mk --table --clustering_fields=block_number,erc20 project:dataset.table
Building from source:
# Clone the repository
git clone <repository-url>
cd erc20-indexing
# Build all components
cargo build --release