fplca

fplca is a high-performance Life Cycle Assessment (LCA) engine written in Haskell. It runs entirely in memory and provides both a command-line interface and a REST API for LCA computations.

Features

  • 📊 LCA Computations: Complete life cycle inventory (LCI) and impact assessment (LCIA)
  • 🌳 Supply Chain Trees: Recursive dependency tree visualization and analysis
  • 🔍 Search & Discovery: Find activities and flows across LCA databases
  • 📈 Multiple Output Formats: JSON, CSV with JSONPath extraction, Pretty tables
  • 🚀 High Performance: In-memory processing with PETSc linear algebra
  • 🌐 REST API: HTTP server with comprehensive endpoints
  • 💾 Smart Caching: Automatic caching for fast repeated queries
  • 📋 EcoSpold2 Support: Native .spold file format parsing (Ecoinvent)

Overview & Capabilities

Core LCA Functions

  • Load EcoSpold2 databases (.spold files from Ecoinvent)
  • Build process dependency trees with circular dependency handling
  • Compute life cycle inventories (LCI) via matrix operations
  • Apply characterization methods for impact assessment (LCIA)
  • Export results in multiple formats (ILCD XML, CSV, JSON)
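
To make the "LCI via matrix operations" step concrete, here is a minimal sketch in plain Python. It is not fplca's implementation (which relies on PETSc); the two-process system, the matrix values, and the function names are all invented for illustration. It solves A·s = f for the scaling vector s, then computes the inventory g = B·s.

```python
def solve(a, b):
    """Solve the dense system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Work on an augmented copy so the inputs stay untouched.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("technosphere matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - known) / m[r][r]
    return x


def inventory(technosphere, biosphere, demand):
    """Return (s, g) where A s = f and g = B s."""
    s = solve(technosphere, demand)
    g = [sum(b_ij * s_j for b_ij, s_j in zip(row, s)) for row in biosphere]
    return s, g


# Hypothetical system. Columns: electricity process, steel process.
# Outputs are positive, inputs negative.
A = [[1.0, -2.0],   # kWh balance: steel production consumes 2 kWh per kg steel
     [-0.1, 1.0]]   # kg steel balance: electricity consumes 0.1 kg steel per kWh
B = [[0.5, 1.5]]    # kg CO2 emitted per unit output of each process
f = [0.0, 1.0]      # functional unit: 1 kg of steel

s, g = inventory(A, B, f)
print("scaling:", s, "inventory:", g)
```

The two processes feed each other, so this toy system contains a loop; the linear solve resolves it without any special handling, which is the usual argument for the matrix formulation.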

Advanced Features

  • Matrix-based computation using PETSc for efficient large-scale solving
  • Automatic loop detection in supply chains
  • Flexible tree depth limits for performance optimization
  • Multi-threaded processing with automatic CPU core detection
  • Database quality validation with comprehensive health checks
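
Loop detection and depth limits can be pictured with a small recursive expansion. The sketch below is an assumption about the general technique, not a transcription of fplca's Haskell code; the graph shape, the `loop` and `truncated` markers, and the function name are hypothetical.

```python
def build_tree(graph, root, max_depth, _path=()):
    """Expand `root` into a nested dict, stopping at loops and at `max_depth`."""
    node = {"id": root, "children": []}
    if root in _path:
        # The activity is already on the current branch: mark the loop and stop.
        node["loop"] = True
        return node
    suppliers = graph.get(root, [])
    if len(_path) >= max_depth:
        # Depth limit reached: record whether anything was cut off.
        node["truncated"] = bool(suppliers)
        return node
    for supplier in suppliers:
        node["children"].append(
            build_tree(graph, supplier, max_depth, _path + (root,))
        )
    return node


# Hypothetical supply chain: steel needs electricity, electricity needs steel.
supply = {
    "steel": ["electricity", "iron ore"],
    "electricity": ["steel"],
    "iron ore": [],
}

tree = build_tree(supply, "steel", max_depth=3)
shallow = build_tree(supply, "steel", max_depth=1)
```

Tracking only the current branch (rather than every visited node) lets the same activity appear in several branches while still cutting genuine cycles.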

Command Reference

Global Options

| Option | Description | Example |
|---|---|---|
| --data PATH | Data directory containing .spold files | --data ./ECOINVENT3.9.1 |
| --format FORMAT | Output format: json, csv, table, or pretty | --format json |
| --jsonpath PATH | JSONPath for CSV extraction (required with CSV) | --jsonpath "srResults" |
| --tree-depth N | Maximum tree depth for calculations | --tree-depth 3 |
| --no-cache | Disable caching (for development) | --no-cache |

Commands

🔍 Search Commands

Search Activities

# Find activities by name
fplca --data ./data activities --name "electricity"

# Geographic filtering
fplca activities --name "transport" --geo "DE" --limit 5

# Product-based search
fplca activities --product "steel" --limit 10 --offset 20

Search Flows

# Find flows by keyword
fplca flows --query "carbon dioxide" --limit 5

# Language-specific search
fplca flows --query "CO2" --lang "en" --limit 10

📊 Activity Analysis

Activity Information

# Get complete activity details
fplca activity "12345678-1234-1234-1234-123456789abc"

Supply Chain Tree

# Build dependency tree (default depth: 2)
fplca tree "12345678-1234-1234-1234-123456789abc"

# Custom tree depth
fplca --tree-depth 4 tree "12345678-1234-1234-1234-123456789abc"

Life Cycle Inventory

# Compute full inventory
fplca inventory "12345678-1234-1234-1234-123456789abc"

🧮 Impact Assessment

LCIA Computation

# Compute impacts with method file
fplca lcia "12345678-1234-1234-1234-123456789abc" \
  --method "./methods/PEF_v3.1.xml"

# Export to XML and CSV
fplca lcia "12345678-1234-1234-1234-123456789abc" \
  --method "./methods/PEF_v3.1.xml" \
  --output results.xml \
  --csv results.csv
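
Conceptually, impact assessment multiplies each inventory flow by a characterization factor and sums the results. The sketch below illustrates that arithmetic only; the factor values are illustrative placeholders and are not taken from PEF or any other method file, and the dictionary layout is an assumption rather than fplca's internal representation.

```python
def lcia_score(inventory, factors):
    """Sum quantity x characterization factor over the flows the method covers."""
    return sum(
        quantity * factors[flow]
        for flow, quantity in inventory.items()
        if flow in factors
    )


# Inventory keyed by (flow name, category), quantities in kg.
inventory_flows = {
    ("carbon dioxide", "air"): 1.25,
    ("methane", "air"): 0.02,
    ("coal", "natural resources"): -2.5,
}

# Illustrative climate-change factors in kg CO2-eq per kg (placeholder values).
climate_factors = {
    ("carbon dioxide", "air"): 1.0,
    ("methane", "air"): 28.0,
}

score = lcia_score(inventory_flows, climate_factors)
print(f"climate change: {score:.2f} kg CO2-eq")
```

Flows without a factor in the chosen method (coal here) simply do not contribute to that impact category.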

🔧 Matrix Export & Debugging

Export Matrices (Ecoinvent Universal Format)

# Export full database matrices in universal format
fplca export-matrices ./output_dir

Debug Matrices

# Export targeted matrix slices for debugging
fplca debug-matrices "12345678-1234-1234-1234-123456789abc" \
  --output ./debug_output

🌐 API Server

Start Server

# Start on default port (8080)
fplca --data ./data server

# Custom port
fplca --data ./data server --port 3000

# With password protection (HTTP Basic Auth)
fplca --data ./data server --password mysecret
# Or via environment variable
FPLCA_PASSWORD=mysecret fplca --data ./data server

# Web interface available at http://localhost:8080/
# API endpoints at http://localhost:8080/api/v1/

Output Formats

JSON Format

fplca --format json activities --limit 2
{
  "srResults": [
    {"prsId": "12345...", "prsName": "electricity production", "prsLocation": "DE"},
    {"prsId": "67890...", "prsName": "transport by truck", "prsLocation": "EU"}
  ],
  "srTotal": 156,
  "srLimit": 2
}
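
For programmatic use, the JSON output can be consumed with any JSON parser. The sketch below parses the sample above and works out the next `--offset` for paging. It assumes that `srTotal` is the total number of matches, which the sample suggests but does not state; the function name is hypothetical.

```python
import json

# Sample output copied from above (IDs are elided in the original).
raw = """
{
  "srResults": [
    {"prsId": "12345...", "prsName": "electricity production", "prsLocation": "DE"},
    {"prsId": "67890...", "prsName": "transport by truck", "prsLocation": "EU"}
  ],
  "srTotal": 156,
  "srLimit": 2
}
"""


def parse_search(raw_json, offset=0):
    """Return (rows, next_offset); next_offset is None when nothing is left."""
    doc = json.loads(raw_json)
    rows = [(r["prsId"], r["prsName"], r["prsLocation"]) for r in doc["srResults"]]
    next_offset = offset + len(rows)
    # Assumption: srTotal counts all matches, not just the returned page.
    return rows, (next_offset if next_offset < doc["srTotal"] else None)


rows, next_offset = parse_search(raw)
for activity_id, name, location in rows:
    print(location, name, activity_id)
```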

Pretty Format (Default)

fplca activities --limit 2  # Uses pretty format by default
{
    "srResults": [
        {
            "prsId": "12345678-1234-1234-1234-123456789abc",
            "prsLocation": "DE",
            "prsName": "electricity production"
        }
    ],
    "srTotal": 156
}

CSV Format with JSONPath

⚠️ Important: CSV format requires --jsonpath to specify which data to extract

Extract Search Results

fplca --format csv --jsonpath "srResults" activities --limit 5
prsId,prsLocation,prsName
12345678-1234-1234-1234-123456789abc,DE,electricity production
87654321-4321-4321-4321-cba987654321,FR,transport by truck

Extract Activity Exchanges

fplca --format csv --jsonpath "piActivity.pfaExchanges" activity "12345..."
ewuFlowName,ewuFlowCategory,ewuUnitName,ewuExchange.techAmount,ewuExchange.techIsInput
electricity,technosphere,kWh,1.0,false
natural gas,technosphere,m3,2.5,true
carbon dioxide,air,kg,0.85,false

Extract Tree Edges

fplca --format csv --jsonpath "teEdges" tree "12345..."
teFlow.fiName,teFlow.fiCategory,teFrom,teTo,teQuantity,teUnit
electricity,technosphere,12345...,67890...,1.0,kWh
natural gas,technosphere,67890...,11111...,2.5,m3

Extract Inventory Flows

fplca --format csv --jsonpath "ieFlows" inventory "12345..."
ifdFlow.flowName,ifdFlow.flowCategory,ifdQuantity,ifdUnitName,ifdIsEmission
carbon dioxide,air,1.25,kg,true
methane,air,0.02,kg,true
coal,natural resources,-2.5,kg,false
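
Once exported, the inventory CSV is easy to post-process. As a minimal sketch, the snippet below reads the sample rows shown above with Python's standard `csv` module and separates emissions from other flows; the function name and the returned layout are choices made for this example.

```python
import csv
import io

# Sample CSV copied from the inventory example above.
sample = """ifdFlow.flowName,ifdFlow.flowCategory,ifdQuantity,ifdUnitName,ifdIsEmission
carbon dioxide,air,1.25,kg,true
methane,air,0.02,kg,true
coal,natural resources,-2.5,kg,false
"""


def split_inventory(text):
    """Split rows into (emissions, others), keyed by (flow name, category)."""
    emissions, others = {}, {}
    for row in csv.DictReader(io.StringIO(text)):
        key = (row["ifdFlow.flowName"], row["ifdFlow.flowCategory"])
        value = (float(row["ifdQuantity"]), row["ifdUnitName"])
        target = emissions if row["ifdIsEmission"] == "true" else others
        target[key] = value
    return emissions, others


emissions, others = split_inventory(sample)
```

In practice, `io.StringIO(text)` would be replaced by an open file handle on the exported `inventory.csv`.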

JSONPath Reference

| Command | Recommended JSONPath | Extracts |
|---|---|---|
| activities | srResults | Activity search results |
| flows | srResults | Flow search results |
| activity <uuid> | piActivity.pfaExchanges | Activity exchanges |
| tree <uuid> | teEdges | Supply chain connections |
| inventory <uuid> | ieFlows | Environmental flows |
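
The dotted column names in the CSV examples (such as `teFlow.fiName`) suggest that nested objects are flattened after the path is resolved. The sketch below mimics that behaviour in Python to show what to expect from a given path. It is an illustration, not fplca's code: the nested JSON shape is inferred from the CSV headers, and `extract` and `flatten` are hypothetical helpers.

```python
def extract(doc, path):
    """Follow a dotted path such as "piActivity.pfaExchanges" through nested dicts."""
    node = doc
    for part in path.split("."):
        if not isinstance(node, dict) or part not in node:
            raise LookupError(f"Path component '{part}' not found")
        node = node[part]
    return node


def flatten(obj, prefix=""):
    """Turn nested dicts into one flat dict with dotted keys."""
    flat = {}
    for key, value in obj.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, name))
        else:
            flat[name] = value
    return flat


# Assumed JSON shape for one tree edge, inferred from the CSV header above.
tree_doc = {
    "teEdges": [
        {
            "teFlow": {"fiName": "electricity", "fiCategory": "technosphere"},
            "teFrom": "12345...",
            "teTo": "67890...",
            "teQuantity": 1.0,
            "teUnit": "kWh",
        }
    ]
}

table = [flatten(edge) for edge in extract(tree_doc, "teEdges")]
print(",".join(table[0]))  # header row built from the flattened keys
```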

REST API Endpoints

When running in server mode, the following endpoints are available:

Activities

  • GET /api/v1/activities - Search activities
  • GET /api/v1/activity/{uuid} - Get activity details
  • GET /api/v1/activity/{uuid}/tree - Get supply chain tree
  • GET /api/v1/activity/{uuid}/inventory - Get life cycle inventory

Flows

  • GET /api/v1/flows - Search flows
  • GET /api/v1/flow/{uuid} - Get flow details

Impact Assessment

  • POST /api/v1/lcia/{uuid} - Compute LCIA with method

Example API Usage

# Start server
fplca --data ./ECOINVENT3.9.1 server --port 8080

# Search activities
curl "http://localhost:8080/api/v1/activities?name=electricity&limit=5"

# Get activity tree
curl "http://localhost:8080/api/v1/activity/12345.../tree?depth=3"

# Compute inventory
curl "http://localhost:8080/api/v1/activity/12345.../inventory"
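
The same calls can be made from a script. The sketch below only builds the requests, using Python's standard library. The endpoint paths and query parameters come from the examples above; the helper name is hypothetical, and the Basic Auth user name is a placeholder, since this README only specifies the password.

```python
import base64
from urllib.parse import urlencode
from urllib.request import Request

BASE_URL = "http://localhost:8080/api/v1"


def api_request(path, params=None, password=None, user="fplca"):
    """Build a GET request for an API path, optionally with HTTP Basic Auth.

    `user` is a placeholder: the README documents only the password.
    """
    url = f"{BASE_URL}/{path}"
    if params:
        url += "?" + urlencode(params)
    request = Request(url)
    if password is not None:
        token = base64.b64encode(f"{user}:{password}".encode()).decode()
        request.add_header("Authorization", f"Basic {token}")
    return request


search = api_request("activities", {"name": "electricity", "limit": 5})
secured = api_request("activities", {"name": "electricity"}, password="mysecret")
print(search.full_url)

# To send a request against a running server:
#   import json
#   from urllib.request import urlopen
#   with urlopen(search) as response:
#       results = json.load(response)
```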

Usage Examples

Basic Workflow

# 1. Search for an activity
fplca --data ./ECOINVENT3.9.1 activities --name "electricity production" --geo "DE" --limit 1

# 2. Get the activity UUID from results, then analyze its supply chain
fplca tree "12345678-1234-1234-1234-123456789abc"

# 3. Compute environmental inventory
fplca inventory "12345678-1234-1234-1234-123456789abc"

# 4. Export tree edges to CSV for further analysis
fplca --format csv --jsonpath "teEdges" tree "12345678-1234-1234-1234-123456789abc" > supply_chain.csv

Data Analysis Workflows

Export Activity Network to CSV

# Get all exchanges for detailed analysis
fplca --format csv --jsonpath "piActivity.pfaExchanges" \
  activity "12345678-1234-1234-1234-123456789abc" > exchanges.csv

Inventory Analysis

# Extract biosphere flows for impact assessment
fplca --format csv --jsonpath "ieFlows" \
  inventory "12345678-1234-1234-1234-123456789abc" > inventory.csv

Multi-format Output

# JSON for programmatic use
fplca --format json inventory "12345..." > inventory.json

# CSV for spreadsheet analysis
fplca --format csv --jsonpath "ieFlows" inventory "12345..." > inventory.csv

# Pretty format for human reading
fplca --format pretty inventory "12345..."

Performance Optimization

Caching

# First run builds cache (slower)
fplca --data ./ECOINVENT3.9.1 activities --name "steel"

# Subsequent runs use cache (much faster)
fplca --data ./ECOINVENT3.9.1 activities --name "aluminum"

# Disable cache for development
fplca --no-cache --data ./ECOINVENT3.9.1 activities --name "cement"

Tree Depth Control

# Fast shallow analysis
fplca --tree-depth 1 tree "12345..."

# Comprehensive deep analysis (slower but complete)
fplca --tree-depth 5 tree "12345..."

Data Quality & Validation

The engine performs automatic database validation:

  • Orphaned flows check: Ensures all flows are properly linked
  • Reference products: Validates all activities have reference outputs
  • Exchange balance: Checks input/output ratios for sanity
  • Geographic coverage: Reports available locations
  • Unit consistency: Validates unit relationships
  • Matrix properties: Analyzes sparsity and conditioning
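
Two of these checks are simple enough to sketch. The snippet below shows one plausible reading of "orphaned flows" (flows that no exchange references) and of the reference-product check, on a toy data structure. Both the interpretation and the dictionary layout are assumptions made for illustration and may differ from what fplca actually does.

```python
def orphaned_flows(flows, activities):
    """Return the flow IDs that no exchange of any activity refers to."""
    used = {ex["flow"] for act in activities for ex in act["exchanges"]}
    return sorted(set(flows) - used)


def missing_reference_products(activities):
    """Return the names of activities without any reference output."""
    return [
        act["name"]
        for act in activities
        if not any(ex.get("reference") for ex in act["exchanges"])
    ]


# Hypothetical toy database.
flows = {"f-elec": "electricity", "f-co2": "carbon dioxide", "f-unused": "argon"}
activities = [
    {
        "name": "electricity production",
        "exchanges": [
            {"flow": "f-elec", "amount": 1.0, "reference": True},
            {"flow": "f-co2", "amount": 0.85},
        ],
    },
    {
        "name": "broken activity",
        "exchanges": [{"flow": "f-co2", "amount": 0.1}],
    },
]

print("orphaned:", orphaned_flows(flows, activities))
print("no reference product:", missing_reference_products(activities))
```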

Sample Validation Output

Database Quality Validation:
  ✓ No orphaned flows found
  ✓ All activities have reference products
  ✓ Average exchange balance: 2.8:1 (outputs:inputs)
  ✓ Database quality: Excellent

Performance Characteristics:
  Matrix density: 12.4% (very sparse - excellent for performance)
  Total matrix entries: 2,847,392 non-zero values
  Solver complexity: O(n^1.5) for 15,842 activities
  Expected solve time: ~15-25 seconds (MUMPS direct solver)

Performance & Scaling

Memory Management

  • In-memory processing: Entire database loaded into RAM for speed
  • Smart caching: Automatic disk cache (~30MB compressed) for instant database reload
  • Matrix optimization: Unboxed sparse matrix storage with minimal overhead
  • Memory optimizations: UUID interning, unboxed vectors, strict evaluation
  • Configurable limits: GHC RTS options to control memory usage (see below)

Computation Performance

  • PETSc integration: High-performance linear algebra via MUMPS solver
  • Multi-threading: Automatic CPU core detection and utilization
  • Matrix precomputation: Pre-factored matrices for fast repeated solves
  • Depth limiting: Configurable tree depth to balance accuracy vs speed

Scaling Guidelines

| Database Size | Live Data | Cache Load (RSS) | Cold Start | Cache Time | Tree Solve |
|---|---|---|---|---|---|
| Sample (3 activities) | ~10 MB | ~50 MB | ~1s | < 0.1s | < 100ms |
| EcoInvent 3.11 (25k activities) | ~305 MB | ~500 MB* | ~45s | ~0.5s | 5-15s |
| Custom large DB (50k+ activities) | ~600 MB | ~1 GB* | ~120s | ~1s | 15-60s |

*With GHC RTS heap cap (+RTS -M1G). Without cap, GHC may allocate 5-7GB arena but only use ~300MB live data.

Memory Control with GHC RTS Options

You can control memory usage using GHC runtime system (RTS) options. Add them after your command:

Cache Load (Memory-Efficient)

# Limit heap to 800MB for shared systems
fplca --data ./data activities --limit 5 +RTS -M800M -H256M -A16M -c -RTS

Cold Start (Performance)

# Allow larger heap for initial parsing
fplca --no-cache --data ./data activities +RTS -M5G -H2G -A64M -RTS

Web Server (Balanced)

# Moderate heap with automatic garbage collection
fplca --data ./data server +RTS -M1G -H512M -A16M -c -I30 -RTS

RTS Options Explained:

  • -M<size>: Maximum heap size (hard cap, prevents excessive memory use)
  • -H<size>: Initial heap size (pre-allocates for performance)
  • -A<size>: Allocation area size (nursery for young generation GC)
  • -c: Enable compacting GC (reduces memory fragmentation)
  • -I<sec>: Idle GC interval (forces cleanup during idle periods)

Why use RTS options?

  • Without a heap cap, the GHC runtime may grow its heap to ~5-7GB while loading
  • Actual live data is much smaller (~300MB for EcoInvent 3.11)
  • RTS caps prevent memory waste on shared systems
  • No performance penalty when caps are reasonable

Error Handling & Troubleshooting

Common Issues

CSV Format Errors

# ❌ Error: CSV requires JSONPath
fplca --format csv activities

# ✅ Correct: Specify JSONPath
fplca --format csv --jsonpath "srResults" activities

Invalid JSONPath

# ❌ Error: Path not found
fplca --format csv --jsonpath "invalidPath" activities
# Output: Error extracting JSONPath 'invalidPath': Path component 'invalidPath' not found

# ✅ Correct paths for each command
fplca --format csv --jsonpath "srResults" activities      # Search results
fplca --format csv --jsonpath "teEdges" tree "uuid"       # Tree edges
fplca --format csv --jsonpath "ieFlows" inventory "uuid"  # Inventory flows

Memory Issues

# If running out of memory, cap the heap with RTS options
fplca inventory "uuid" +RTS -M1G -RTS

# Or reduce tree depth for complex calculations
fplca --tree-depth 1 inventory "uuid"

# Or disable caching during development
fplca --no-cache inventory "uuid"

Validation Errors

The CLI validates options before execution:

  • --format csv requires --jsonpath
  • --jsonpath can only be used with --format csv
  • Invalid UUIDs are caught early with helpful messages
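
When fplca is driven from a script, the same rules can be checked before spawning the process. The sketch below builds an argument list with the global options placed before the command, as in the examples throughout this document. The helper name and its error messages are hypothetical and are not fplca's own.

```python
def fplca_argv(command, *args, data=None, fmt=None, jsonpath=None):
    """Build an fplca argument list, enforcing the csv/jsonpath pairing."""
    if fmt == "csv" and jsonpath is None:
        raise ValueError("--format csv requires --jsonpath")
    if jsonpath is not None and fmt != "csv":
        raise ValueError("--jsonpath can only be used with --format csv")
    argv = ["fplca"]
    if data is not None:
        argv += ["--data", data]
    if fmt is not None:
        argv += ["--format", fmt]
    if jsonpath is not None:
        argv += ["--jsonpath", jsonpath]
    return argv + [command, *args]


argv = fplca_argv("tree", "uuid", fmt="csv", jsonpath="teEdges")
print(" ".join(argv))

# To run it:
#   import subprocess
#   result = subprocess.run(argv, capture_output=True, text=True, check=True)
```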

License

GNU Affero General Public License v3.0 or later. See the LICENSE file for details.


Contributing & Development

This is a research/educational tool designed for LCA practitioners and developers. The codebase prioritizes:

  • Correctness: Rigorous LCA methodology implementation
  • Performance: Efficient large-scale computation
  • Usability: Clear CLI and comprehensive documentation
  • Extensibility: Modular design for easy feature addition

For development setup, detailed build instructions, and an architecture overview, see CLAUDE.md.
