Tip
Live Demo: https://metadata-quality.mjanez.dev/
A modern web application for evaluating RDF metadata quality based on FAIR+C principles, built with React + TypeScript.
Tip
For Docker Compose deployment with backend support see: Docker
- Complete MQA evaluation with real metrics for DCAT-AP, DCAT-AP-ES and NTI-RISP
- Data Quality Analysis: ISO/IEC 25012-based assessment for CSV/JSON distributions (only with backend enabled)
- Multi-format support: RDF/XML, Turtle, JSON-LD, N-Triples, GeoJSON with auto-detection
- Remote URL processing to validate online datasets (full option only if backend enabled)
- SPARQL endpoint integration with predefined queries for data portals
- Dashboard for monitoring and managing metadata quality results.
- Interactive visualization with FAIR+C radar charts and detailed tables
- Controlled vocabularies integrated (formats, licenses, access rights, etc.) from data.europa.eu
- Responsive interface with Bootstrap 5 and modern components
- Full TypeScript for safe and maintainable development
- Tech Stack
- Quick Start
- Configuration
- Development
- Deployment
- Architecture
- Internationalization
- Theming
- Troubleshooting
| Technology | Version | Purpose |
|---|---|---|
| React | 19.1.10 | UI framework with modern hooks |
| TypeScript | 4.9.5 | Static typing and safe development |
| N3.js | 1.26.0 | RDF parsing and manipulation |
| rdfxml-streaming-parser | 3.1.0 | RDF/XML → Turtle conversion |
| shacl-engine | 1.0.2 | A fast RDF/JS SHACL engine includes SHACL SPARQL-based Constraints |
| Bootstrap | 5.3.7 | Responsive CSS framework |
| Chart.js | 4.5.0 | Radar charts visualization |
| react-i18next | Latest | Internationalization support |
| gh-pages | 6.3.0 | Automated GitHub Pages deployment |
Prerequisites: Docker and Docker Compose installed
# Clone repository
git clone https://github.com/mjanez/metadata-quality-react.git
cd metadata-quality-react
# Start with pre-built image from GHCR
docker compose up -d
# Or build locally
IMAGE_TAG=local docker compose up -d --build
Tip
Application will be available at: https://localhost:443 (HTTP auto-redirects to HTTPS)
- Frontend: https://localhost:443
- Backend API: https://localhost:443/api/health
Prerequisites: Node.js >= 16.x, npm >= 8.x
# Quick start (both frontend and backend)
./dev-start.sh
# Or manually:
# Install dependencies
npm install
# Start development server (frontend only)
npm start
# Optional: Start backend server (in separate terminal)
cd backend
npm i && npm start
Tip
Development servers:
- Frontend: http://localhost:3000
- Backend API: http://localhost:3001/api/health
The application is configured through the src/config/mqa-config.json file, which centralizes all settings for profiles, metrics, and SPARQL endpoints. This configuration file follows a structured approach to support multiple metadata standards and quality assessment methodologies.
Profiles define the metadata standards (DCAT-AP, DCAT-AP-ES, NTI-RISP) with their specific versions and validation rules.
{
"profiles": {
"profile_id": {
"versions": {
"version_number": {
"name": "Display Name",
"maxScore": 405,
"icon": "img/icons/icon.svg",
"url": "https://profile-documentation-url",
"sampleUrl": "https://sample-data-url",
"shaclFiles": [
"https://shacl-validation-file-1.ttl",
"https://shacl-validation-file-2.ttl"
],
"dimensions": {
"findability": { "maxScore": 100 },
"accessibility": { "maxScore": 100 },
"interoperability": { "maxScore": 110 },
"reusability": { "maxScore": 75 },
"contextuality": { "maxScore": 20 }
}
}
},
"defaultVersion": "version_number"
}
}
}
- Create the profile structure in mqa-config.json:
"my_custom_profile": {
"versions": {
"1.0.0": {
"name": "My Custom Profile 1.0.0",
"maxScore": 400,
"icon": "img/icons/custom.svg",
"url": "https://my-profile-docs.com",
"sampleUrl": "https://my-sample-data.ttl",
"shaclFiles": [
"https://my-shacl-validation.ttl"
],
"dimensions": {
"findability": { "maxScore": 100 },
"accessibility": { "maxScore": 100 },
"interoperability": { "maxScore": 100 },
"reusability": { "maxScore": 75 },
"contextuality": { "maxScore": 25 }
}
}
},
"defaultVersion": "1.0.0"
}
- Add corresponding metrics in the profile_metrics section
- Add icon file to public/img/icons/
- Update translations in public/locales/en/translation.json and public/locales/es/translation.json
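In the sample profiles above, maxScore equals the sum of the per-dimension maxScore values (100 + 100 + 110 + 75 + 20 = 405, and 100 + 100 + 100 + 75 + 25 = 400 for the custom profile). A minimal sketch of a consistency check for a new profile follows. It assumes that this equality is intended. The helper names sumDimensionScores and isProfileConsistent are hypothetical and not part of the project.

```typescript
// Shape of one profile version, mirroring the JSON structure shown above.
interface DimensionConfig {
  maxScore: number;
}

interface ProfileVersion {
  name: string;
  maxScore: number;
  dimensions: Record<string, DimensionConfig>;
}

// Hypothetical helper: adds up the maxScore of every dimension.
function sumDimensionScores(version: ProfileVersion): number {
  let total = 0;
  for (const key of Object.keys(version.dimensions)) {
    total += version.dimensions[key].maxScore;
  }
  return total;
}

// Hypothetical helper: true when the declared maxScore matches the dimension total.
// Assumption: the profile-level maxScore is meant to equal that sum.
function isProfileConsistent(version: ProfileVersion): boolean {
  return sumDimensionScores(version) === version.maxScore;
}

// The custom profile from the example above.
const customProfile: ProfileVersion = {
  name: "My Custom Profile 1.0.0",
  maxScore: 400,
  dimensions: {
    findability: { maxScore: 100 },
    accessibility: { maxScore: 100 },
    interoperability: { maxScore: 100 },
    reusability: { maxScore: 75 },
    contextuality: { maxScore: 25 }
  }
};

const customProfileIsConsistent = isProfileConsistent(customProfile);
```

Running a check like this before committing a new profile can catch a maxScore that was not updated after a dimension changed.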
Metrics define how quality is measured for each FAIR+C dimension. Each metric has an ID, weight, and associated RDF property.
{
"profile_metrics": {
"profile_id": {
"dimension_name": [
{
"id": "metric_identifier",
"weight": 30,
"property": "rdf:property"
}
]
}
}
}
- Define the metric in the appropriate profile and dimension:
"findability": [
{
"id": "my_custom_metric",
"weight": 25,
"property": "my:customProperty"
}
]
- Add metric labels to translation files:
{
"metrics": {
"specific": {
"my_custom_metric": "My Custom Metric"
}
}
}
- Implement evaluation logic in src/services/MQAService.ts if needed
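When adding metrics, it can help to check that the weights in each dimension still add up as expected. The sketch below assumes that the weights of a dimension should sum to that dimension's maxScore. The configuration does not state this rule explicitly. The helper names and the sample weights are illustrative only.

```typescript
// One metric entry, as in the profile_metrics structure above.
interface Metric {
  id: string;
  weight: number;
  property: string;
}

// Metrics of one profile, grouped by dimension name.
type ProfileMetrics = Record<string, Metric[]>;

// Adds up the weights of a list of metrics.
function sumWeights(metrics: Metric[]): number {
  let total = 0;
  for (const metric of metrics) {
    total += metric.weight;
  }
  return total;
}

// Hypothetical helper: returns the dimensions whose weights do not sum to
// the expected maximum. Assumption: weights should add up to maxScore.
function findWeightMismatches(
  metrics: ProfileMetrics,
  maxScores: Record<string, number>
): string[] {
  const mismatches: string[] = [];
  for (const dimension of Object.keys(maxScores)) {
    const total = sumWeights(metrics[dimension] || []);
    if (total !== maxScores[dimension]) {
      mismatches.push(dimension);
    }
  }
  return mismatches;
}

// Illustrative data: findability is complete, accessibility is missing weight.
const sampleMetrics: ProfileMetrics = {
  findability: [
    { id: "dcat_keyword", weight: 30, property: "dcat:keyword" },
    { id: "dcat_theme", weight: 30, property: "dcat:theme" },
    { id: "dct_spatial", weight: 20, property: "dct:spatial" },
    { id: "dct_temporal", weight: 20, property: "dct:temporal" }
  ],
  accessibility: [
    { id: "dcat_access_url_status", weight: 50, property: "dcat:accessURL" }
  ]
};

const weightMismatches = findWeightMismatches(sampleMetrics, {
  findability: 100,
  accessibility: 100
});
```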
| Dimension | Code | Description | Typical Metrics |
|---|---|---|---|
| Findability | findability | How easily the dataset can be found | Keywords, themes, spatial/temporal coverage |
| Accessibility | accessibility | How accessible the data is | Access URLs, download URLs, status checks |
| Interoperability | interoperability | Technical interoperability | Formats, media types, standards compliance |
| Reusability | reusability | How easily the data can be reused | Licenses, access rights, contact information |
| Contextuality | contextuality | Contextual information provided | Size, dates, rights information |
The SPARQL configuration enables integration with data portals and endpoints for automated data retrieval and validation.
{
"sparql_config": {
"default_endpoint": "https://sparql-endpoint-url",
"queries": {
"profile_id": [
{
"id": "query_identifier",
"name": "Human-readable name",
"description": "Query description",
"query": "SPARQL query string with {parameter} placeholders",
"parameters": [
{
"name": "parameter_name",
"label": "Parameter Label",
"type": "text",
"required": true,
"placeholder": "Enter value...",
"description": "Parameter description"
}
]
}
]
}
}
}
- Define the query for a specific profile:
"my_profile": [
{
"id": "my_custom_query",
"name": "Custom Data Query",
"description": "Retrieves custom dataset information",
"query": "PREFIX dcat: <http://www.w3.org/ns/dcat#>\nPREFIX dct: <http://purl.org/dc/terms/>\nCONSTRUCT {\n ?dataset a dcat:Dataset ;\n dct:title ?title .\n}\nWHERE {\n ?dataset a dcat:Dataset ;\n dct:publisher ?publisher ;\n dct:title ?title .\n FILTER (regex(str(?publisher), \"{org_id}\", \"i\"))\n}\nLIMIT {limit}",
"parameters": [
{
"name": "org_id",
"label": "Organization ID",
"type": "text",
"required": true,
"placeholder": "e.g., ministry-of-health",
"description": "Identifier of the organization"
},
{
"name": "limit",
"label": "Result Limit",
"type": "number",
"required": false,
"placeholder": "50",
"description": "Maximum number of results"
}
]
}
]
- Parameter Types Available:
  - text: Text input
  - number: Numeric input
  - select: Dropdown (requires options array)
  - textarea: Multi-line text
- Query Features:
  - Parameter substitution: Use {parameter_name} in queries
  - CONSTRUCT queries: Preferred for generating valid RDF
  - Endpoint testing: Use debug queries to test connectivity
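The parameter substitution described above can be sketched as a plain string replacement over the declared parameters. This is a minimal illustration and not the project's SPARQLService implementation. The function name substituteParameters is hypothetical, and the sketch does no escaping, so values should be validated before they reach an endpoint.

```typescript
// Parameter definition, following the sparql_config structure above.
interface QueryParameter {
  name: string;
  label: string;
  type: "text" | "number" | "select" | "textarea";
  required: boolean;
  placeholder?: string;
  description?: string;
}

// Hypothetical helper: replaces every {name} placeholder of a declared
// parameter with the supplied value. Only declared names are touched, so the
// braces that belong to SPARQL graph patterns are left alone.
function substituteParameters(
  query: string,
  parameters: QueryParameter[],
  values: Record<string, string>
): string {
  let result = query;
  for (const parameter of parameters) {
    const value = values[parameter.name];
    if (value === undefined || value === "") {
      if (parameter.required) {
        throw new Error(`Missing required parameter: ${parameter.name}`);
      }
      // Optional parameter without a value: its placeholder stays in the
      // query, so callers should supply a default when the query needs one.
      continue;
    }
    // split/join replaces every occurrence of the placeholder.
    result = result.split(`{${parameter.name}}`).join(value);
  }
  return result;
}

const sampleParameters: QueryParameter[] = [
  { name: "org_id", label: "Organization ID", type: "text", required: true },
  { name: "limit", label: "Result Limit", type: "number", required: false }
];

const sampleTemplate =
  'SELECT * WHERE { ?s ?p ?o . FILTER (regex(str(?s), "{org_id}", "i")) } LIMIT {limit}';

const resolvedQuery = substituteParameters(sampleTemplate, sampleParameters, {
  org_id: "ministry-of-health",
  limit: "50"
});
```

Restricting replacement to declared parameter names is a deliberate choice here: a generic brace-matching pattern would also match the braces of the WHERE clause.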
Special debug queries help test endpoint connectivity:
"debug": [
{
"id": "test_endpoint",
"name": "Test Endpoint",
"description": "Verify endpoint connectivity",
"query": "SELECT * WHERE { ?s ?p ?o } LIMIT 10",
"parameters": []
}
]
- Profile Naming: Use consistent IDs (dcat_ap, dcat_ap_es, dcat_ap_es_hvd, nti_risp)
- Version Management: Support multiple versions per profile
- Metric Weights: Ensure weights sum to reasonable totals per dimension
- SPARQL Queries: Use CONSTRUCT queries for better RDF generation
- Parameter Validation: Provide clear descriptions and examples
- Icon Management: Store icons in public/img/icons/ as SVG
- Translation Keys: Keep metric IDs consistent across profiles
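Keeping metric IDs consistent with the translation files can be checked mechanically. The sketch below collects every metric ID from a profile_metrics object and reports those without an entry under metrics.specific. The function name findMissingLabels and the sample data are hypothetical.

```typescript
// Only the metric ID matters for this check.
interface MetricRef {
  id: string;
}

// profile id -> dimension name -> metrics, as in profile_metrics.
type AllProfileMetrics = Record<string, Record<string, MetricRef[]>>;

// Hypothetical helper: returns metric IDs that have no label in the given
// metrics.specific object. Each ID is reported once.
function findMissingLabels(
  profileMetrics: AllProfileMetrics,
  specificLabels: Record<string, string>
): string[] {
  const missing: string[] = [];
  for (const profile of Object.keys(profileMetrics)) {
    const dimensions = profileMetrics[profile];
    for (const dimension of Object.keys(dimensions)) {
      for (const metric of dimensions[dimension]) {
        const hasLabel = Object.prototype.hasOwnProperty.call(specificLabels, metric.id);
        if (!hasLabel && missing.indexOf(metric.id) === -1) {
          missing.push(metric.id);
        }
      }
    }
  }
  return missing;
}

// Illustrative data: one metric has a label, the custom one does not.
const missingLabels = findMissingLabels(
  {
    dcat_ap: {
      findability: [{ id: "dcat_keyword" }, { id: "my_custom_metric" }]
    },
    dcat_ap_es: {
      findability: [{ id: "dcat_keyword" }, { id: "my_custom_metric" }]
    }
  },
  { dcat_keyword: "Keywords" }
);
```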
| Script | Command | Description |
|---|---|---|
| Development (Full) | ./dev-start.sh | Start both frontend and backend servers |
| Development (Frontend) | npm start | Local server with hot reload (frontend only) |
| Cleanup | ./dev-cleanup.sh | Stop all development servers and clean ports |
| Build | npm run build | Optimized production build |
| Deploy | npm run deploy | Automatic deploy to GitHub Pages |
| Test | npm test | Run tests (if any) |
react-app/
├── public/
│ ├── data/ # JSONL vocabularies
│ │ ├── access_rights.jsonl
│ │ ├── file_types.jsonl
│ │ ├── licenses.jsonl
│ │ └── ...
│ ├── locales/ # i18n translations
│ │ ├── en/translation.json # English translations + metrics labels
│ │ └── es/translation.json # Spanish translations + metrics labels
│ └── img/icons/ # Profile icons
├── src/
│ ├── components/ # React components
│ │ ├── ValidationForm.tsx # Input form + SPARQL integration
│ │ ├── ValidationResults.tsx # Results and charts
│ │ ├── QualityChart.tsx # FAIR+C radar chart
│ │ └── ...
│ ├── services/ # Business logic
│ │ ├── MQAService.ts # Main MQA engine + metric evaluation
│ │ ├── SPARQLService.ts # SPARQL endpoint integration
│ │ └── RDFService.ts # RDF processing
│ ├── config/ # Configuration
│ │ └── mqa-config.json # **Central configuration file**
│ ├── types/ # TypeScript types
│ └── i18n/ # Internationalization setup
└── package.json
This application can be deployed on multiple platforms with different configurations.
Important
Automatic Configuration for Docker Deployments
When using Docker Compose, the container automatically enables backend features on startup:
- Detects if backend_server.enabled or data_quality.enabled is false
- Automatically updates configuration to true for Docker deployment
- Creates backup of original config as mqa-config.json.bak
- No manual configuration needed! 🎉
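The effect of the automatic step above can be pictured as a small pure function over the two flags. This is only an illustration of the described behaviour and not the container's actual startup script. The name enableBackendFeatures is hypothetical, and file handling (including the .bak backup) is left out.

```typescript
// The two configuration sections involved, with the fields shown in this README.
interface BackendServerConfig {
  enabled: boolean;
  url?: string;
}

interface DataQualityConfig {
  enabled: boolean;
  require_backend?: boolean;
}

interface BackendFlags {
  backend_server: BackendServerConfig;
  data_quality: DataQualityConfig;
}

// Hypothetical helper: returns a copy with both flags enabled, plus a
// "changed" marker. A real startup script would write the backup only when
// "changed" is true.
function enableBackendFeatures(
  config: BackendFlags
): { config: BackendFlags; changed: boolean } {
  const changed = !config.backend_server.enabled || !config.data_quality.enabled;
  if (!changed) {
    return { config: config, changed: false };
  }
  return {
    config: {
      backend_server: { ...config.backend_server, enabled: true },
      data_quality: { ...config.data_quality, enabled: true }
    },
    changed: true
  };
}

// A static-hosting style configuration, as used for GitHub Pages.
const staticConfig: BackendFlags = {
  backend_server: { enabled: false, url: "" },
  data_quality: { enabled: false, require_backend: true }
};

const dockerResult = enableBackendFeatures(staticConfig);
```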
Manual Configuration (only if deploying to platforms without backend):
| Platform | backend_server.enabled | data_quality.enabled | Auto-configured? |
|---|---|---|---|
| Docker Compose | true | true | ✅ Yes (automatic) |
| GitHub Pages | false | false | ❌ No (manual) |
| Local Development | true | true | ✅ Yes (via dev-start.sh) |
For GitHub Pages deployment:
# Disable backend features
sed -i 's/"enabled": true/"enabled": false/g' src/config/mqa-config.json
npm run deploy
| Platform | Frontend | Backend | Auto HTTPS | Free Tier | CI/CD | Backend Config | Data Quality | Best For |
|---|---|---|---|---|---|---|---|---|
| Docker | ✅ | ✅ (Express) | ⚙️ | - | ⚙️ | enabled: true | ✅ Full analysis | Self-hosted (Full control) |
| GitHub Pages | ✅ | ❌ | ✅ | ✅ | ✅ | enabled: false | ❌ Limited | Demo/Docs (No backend) |
Features: Full control, both frontend and backend, custom domain support
Note
Docker Configuration: For Docker deployment, backend features should be enabled in mqa-config.json:
{
"backend_server": {
"enabled": true,
"url": ""
},
"data_quality": {
"enabled": true,
"require_backend": true
}
}
# 1. Clone and configure
git clone https://github.com/mjanez/metadata-quality-react.git
cd metadata-quality-react
cp .env.example .env
# 2. Start with pre-built image (recommended)
docker compose up -d
# Or build locally
IMAGE_TAG=local docker compose up -d --build
# 3. Access application
# Frontend: https://localhost:443
# Backend API: https://localhost:443/api/health
Note
By default, uses the latest stable image from GHCR. Set IMAGE_TAG=develop for development version or IMAGE_TAG=local to build locally.
| Mode | Command | Use Case | Features |
|---|---|---|---|
| Production (GHCR) | docker compose up -d | Self-hosted with pre-built image | Fast deployment, auto-updates, SSL, caching |
| Development (Local) | IMAGE_TAG=local docker compose up -d --build | Local build and testing | Custom changes, SSL, caching, auto-restart |
services:
  mqa-app: # React frontend + Express backend
    ports:
      - "3000:3000" # Frontend
      - "3001:3001" # Backend API
  nginx: # Reverse proxy (production profile)
    ports:
      - "80:80" # HTTP
      - "443:443" # HTTPS (requires SSL configuration)
Environment Variables (.env file):
# Port Configuration
FRONTEND_PORT=3000
BACKEND_PORT=3001
# Application
PUBLIC_URL=/
REACT_APP_BACKEND_URL=http://localhost:3001/api
NODE_ENV=production
Custom Configuration:
volumes:
  # Mount custom MQA config
  - ./mqa-config.json:/app/build/config/mqa-config.json:ro
  # Mount custom vocabularies
  - ./public/data:/app/build/data:ro
# View logs
docker compose logs -f mqa-app
# Restart services
docker compose restart
# Stop and remove
docker compose down
# Update and rebuild
docker compose up -d --build
# Health check
curl http://localhost:3000/
curl http://localhost:3001/api/health
Pre-built Docker images are automatically published to GitHub Container Registry on every push and pull request.
Quick Deploy:
# Pull latest stable version
docker pull ghcr.io/mjanez/metadata-quality-react:latest
# Run with docker
docker run -d -p 3000:3000 -p 3001:3001 \
ghcr.io/mjanez/metadata-quality-react:latest
# Or with docker compose
cat > docker-compose.yml << EOF
services:
  mqa-app:
    image: ghcr.io/mjanez/metadata-quality-react:latest
    ports:
      - "3000:3000"
      - "3001:3001"
    environment:
      - NODE_ENV=production
    restart: unless-stopped
EOF
docker compose up -d
Available Tags:
- latest: Latest stable version (main branch)
- develop: Development version (develop branch)
- pr-<number>: Pull request specific image
- v1.2.3: Semantic versioning tags
Multi-Architecture Support: Images built for linux/amd64 and linux/arm64 (Apple Silicon/ARM servers)
See: GHCR Documentation for complete details on image tags, security scanning, and usage.
- Generate SSL certificates (automatic self-signed for development):
# Generate self-signed certificate (local development)
make ssl-generate
# or
./docker/nginx/generate-ssl.sh
# For production: replace with valid certificates
# See docker/README.md#ssl-configuration
- Start with nginx profile:
# HTTP automatically redirects to HTTPS
docker compose up -d
# or
make up-prod
- Access via HTTPS:
https://localhost (development - will show browser warning)
https://your-domain.com (production with valid certificate)
Note
Self-signed certificates: Browsers will show a security warning. This is normal for local development.
Production: Replace certificates in docker/nginx/ssl/ with valid ones from Let's Encrypt or a CA.
For complete deployment with API backend (Python FastAPI):
git clone https://github.com/mjanez/metadata-quality-stack
cd metadata-quality-stack
docker compose up -d
Includes:
- React frontend (this project)
- Python FastAPI backend
- Streamlit dashboard
- Nginx reverse proxy
- Volume persistence
| Issue | Solution |
|---|---|
| Port in use | Change FRONTEND_PORT or BACKEND_PORT in .env |
| Build fails | docker compose build --no-cache |
| Network issues | docker compose down && docker network prune |
| Permission errors | sudo chown -R $USER:$USER . |
Features: Simple static hosting, free for public repos, no backend support
Important
GitHub Pages Configuration: For GitHub Pages deployment, you MUST disable backend features in mqa-config.json:
{
"backend_server": {
"enabled": false
},
"data_quality": {
"enabled": false
}
}
Why: GitHub Pages only serves static files and cannot run backend services. Leaving these enabled will cause:
- Failed API requests and console errors
- Non-functional data quality analysis features
- Degraded user experience with loading states that never complete
- Repository must be public
- GitHub Pages must be enabled in repository settings
# Deploy to GitHub Pages
npm run deploy
This command:
- Builds the application with correct PUBLIC_URL
- Deploys to gh-pages branch
- Makes it available at: https://{username}.github.io/{repo-name}/
# 1. Build with correct base path
npm run build
# 2. Deploy using gh-pages
npx gh-pages -d build
For complete local development including backend:
# Install dependencies and start both servers
./dev-start.sh
# If ports are occupied, clean up first:
./dev-cleanup.sh && ./dev-start.sh
# Terminal 1: Start backend server
cd backend
npm install
npm start
# Backend running on http://localhost:3001
# Terminal 2: Start React app
npm install
npm start
# Frontend running on http://localhost:3000
Environment Setup:
The .env.local file is automatically configured for local development:
# Local development configuration (already configured)
BROWSER=none
PORT=3000
BACKEND_PORT=3001
REACT_APP_BACKEND_URL=http://localhost:3001/api
REACT_APP_ENV=development
Custom Configuration:
To modify ports or settings, edit .env.local:
# Change frontend port
PORT=3005
# Change backend port
BACKEND_PORT=3002
REACT_APP_BACKEND_URL=http://localhost:3002/api
Development Script Features:
The ./dev-start.sh script automatically:
- Checks Node.js and npm installation
- Installs dependencies if missing
- Loads environment variables from .env.local
- Starts backend server on configured port (default: 3001)
- Starts frontend development server on configured port (default: 3000)
- Verifies backend health check
- Handles graceful shutdown on Ctrl+C
Backend Configuration:
The backend (backend/server.js) provides:
- CORS proxy for accessing external RDF data
- URL validation to check accessibility
- Data download with SSL certificate handling
- Health check endpoint
- Batch URL validation for performance
API Endpoints:
GET /api/health # Health check
POST /api/validate-url # Single URL validation
POST /api/validate-urls-batch # Batch URL validation (performance)
POST /api/download-data # Download and proxy data
Port Configuration:
| Service | Default Port | Environment Variable | Configuration File |
|---|---|---|---|
| Frontend | 3000 | PORT | .env.local |
| Backend | 3001 | BACKEND_PORT | .env.local |
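The API endpoints listed above sit under a base URL that differs per deployment: http://localhost:3001/api for local development and an empty backend_server.url for Docker. The sketch below shows one way to build endpoint URLs from that setting. It assumes an empty URL means "same origin, under /api", which this README suggests but does not state. The helper name buildApiUrl is hypothetical.

```typescript
// Hypothetical helper: joins the configured backend URL with an endpoint name.
// Assumption: an empty backendUrl means the API is served from the same
// origin under /api (as in the Docker setup behind nginx).
function buildApiUrl(backendUrl: string, endpoint: string): string {
  // Drop trailing slashes from the base and leading slashes from the endpoint
  // so that exactly one slash separates them.
  const base = backendUrl === "" ? "/api" : backendUrl.replace(/\/+$/, "");
  const path = endpoint.replace(/^\/+/, "");
  return `${base}/${path}`;
}

// Local development: backend on its own port.
const localHealthUrl = buildApiUrl("http://localhost:3001/api", "health");

// Docker: empty URL, resolved relative to the page origin.
const dockerValidateUrl = buildApiUrl("", "/validate-url");
```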
flowchart TB
subgraph App ["App.tsx"]
direction LR
VF["ValidationForm\n(SPARQL Queries)"]
VR["ValidationResults\n(Chart + Metrics)"]
TT["ThemeToggle"]
LS["LanguageSelector"]
VF --> VR
end
App --> MQA["MQAService\n(Metrics Evaluation)"]
App --> SPARQL["SPARQLService\n(Queries Execution)"]
MQA --> RDF["RDFService\n(Parsing & Validation)"]
SPARQL --> CONFIG["mqa-config.json\n(Profiles & Metrics)"]
RDF --> CONFIG
- Configuration Loading: mqa-config.json → Services
- User Input: Form data → ValidationForm
- SPARQL Integration: Queries → SPARQLService → RDF data
- RDF Processing: Raw data → RDFService → Parsed triples
- Quality Assessment: Triples → MQAService → Metrics scores
- Visualization: Scores → Chart components → User interface
- DCAT-AP 2.1.1: 405 maximum points
- DCAT-AP 3.0.0: 405 maximum points
- DCAT-AP-ES 1.0.0: 405 maximum points
- NTI-RISP 2013: 305 maximum points
| Dimension | Description |
|---|---|
| F - Findability | Ease of finding the dataset |
| A - Accessibility | Data accessibility |
| I - Interoperability | Technical interoperability |
| R - Reusability | Ease of reuse |
| C - Contextuality | Contextual information |
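Because each dimension has a different maximum (for example 110 for interoperability and 20 for contextuality in the sample profile), raw scores are easier to compare once they are expressed as percentages, for instance before plotting them on a radar chart. The sketch below shows that normalisation. The function names and the sample scores are illustrative and do not describe how QualityChart.tsx is implemented.

```typescript
type Dimension =
  | "findability"
  | "accessibility"
  | "interoperability"
  | "reusability"
  | "contextuality";

type DimensionScores = Record<Dimension, number>;

const DIMENSIONS: Dimension[] = [
  "findability",
  "accessibility",
  "interoperability",
  "reusability",
  "contextuality"
];

// Rounds a ratio to a percentage with one decimal place.
function toPercent(score: number, max: number): number {
  return max > 0 ? Math.round((score / max) * 1000) / 10 : 0;
}

// Hypothetical helper: converts raw dimension scores into percentages of
// each dimension's maximum.
function normalizeScores(scores: DimensionScores, maxScores: DimensionScores): DimensionScores {
  const result = {} as DimensionScores;
  for (const dimension of DIMENSIONS) {
    result[dimension] = toPercent(scores[dimension], maxScores[dimension]);
  }
  return result;
}

// Adds up the raw scores across all five dimensions.
function totalScore(scores: DimensionScores): number {
  let total = 0;
  for (const dimension of DIMENSIONS) {
    total += scores[dimension];
  }
  return total;
}

// Maximums from the sample profile structure (405 points in total).
const sampleMax: DimensionScores = {
  findability: 100,
  accessibility: 100,
  interoperability: 110,
  reusability: 75,
  contextuality: 20
};

// Made-up scores, for illustration only.
const sampleScores: DimensionScores = {
  findability: 80,
  accessibility: 50,
  interoperability: 55,
  reusability: 75,
  contextuality: 10
};

const samplePercentages = normalizeScores(sampleScores, sampleMax);
const sampleOverall = toPercent(totalScore(sampleScores), totalScore(sampleMax));
```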
- English (default)
- Spanish (Español)
- Create translation file in public/locales/{lang}/translation.json
- Add language option to LanguageSelector component
- Update i18n configuration in src/i18n/index.ts
Translation files in public/locales/{lang}/translation.json include both UI labels and metric definitions:
{
"common": {
"title": "Metadata Quality Assessment",
"loading": "Loading...",
"validate": "Validate"
},
"dimensions": {
"findability": "Findability",
"accessibility": "Accessibility",
"interoperability": "Interoperability",
"reusability": "Reusability",
"contextuality": "Contextuality"
},
"metrics": {
"labels": {
"name": "Métrica",
"score": "Puntuación",
...
},
"specific": {
"dcat_keyword": "Palabras clave",
"dcat_theme": "Temas/Categorías",
"dct_spatial": "Cobertura espacial",
"dct_temporal": "Cobertura temporal",
"dcat_access_url_status": "Disponibilidad de la URL de acceso"
...
}
}
}
The metrics.specific section contains all quality metric translations and is integrated with the MQA evaluation system.
Edit these files:
- src/App.css: Main application styles
- src/components/*.css: Component-specific styles
- Bootstrap variables can be overridden in CSS
:root {
--bs-primary: #0d6efd;
--mqa-chart-bg: #ffffff;
--mqa-text-color: #212529;
}
[data-bs-theme="dark"] {
--mqa-chart-bg: #212529;
--mqa-text-color: #ffffff;
}
Controlled vocabularies are stored in public/data/ as JSONL files for efficient loading:
| File | Purpose | Usage |
|---|---|---|
| access_rights.jsonl | Access rights vocabulary | License validation |
| file_types.jsonl | File format types | Format classification |
| licenses.jsonl | License definitions | License compliance |
| machine_readable.jsonl | Machine-readable formats | Interoperability metrics |
| media_types.jsonl | MIME media types | Format validation |
| non_proprietary.jsonl | Non-proprietary formats | Openness assessment |
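JSONL stores one JSON object per line, so a vocabulary file can be parsed line by line without loading a single large array. A minimal parsing sketch follows. The function name parseJsonl and the record fields (uri, label) are assumptions for illustration, since the actual field names of the vocabulary files are not shown here.

```typescript
// Hypothetical record shape. The real vocabulary files may use other fields.
interface VocabularyEntry {
  uri: string;
  label: string;
}

// Parses JSONL text: one JSON value per non-empty line.
// Throws with the 1-based line number when a line is not valid JSON.
function parseJsonl<T>(text: string): T[] {
  const records: T[] = [];
  const lines = text.split(/\r?\n/);
  for (let i = 0; i < lines.length; i++) {
    const line = lines[i].trim();
    if (line === "") {
      continue; // tolerate blank lines and a trailing newline
    }
    try {
      records.push(JSON.parse(line) as T);
    } catch (err) {
      throw new Error(`Invalid JSON on line ${i + 1}`);
    }
  }
  return records;
}

// Illustrative content, not taken from the real files.
const sampleJsonl =
  '{"uri": "http://example.org/format/CSV", "label": "CSV"}\n' +
  "\n" +
  '{"uri": "http://example.org/format/JSON", "label": "JSON"}\n';

const sampleEntries = parseJsonl<VocabularyEntry>(sampleJsonl);
```

In the browser, the text would typically come from fetching a file under public/data/ before handing it to a parser like this one.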
# Convert CSV vocabularies to JSONL format
python3 scripts/vocabs_csv2jsonl.py
# Clear cache
rm -rf node_modules/.cache
npm run build
# Check types without build
npx tsc --noEmit
# Check gh-pages branch
git checkout gh-pages
git log --oneline -5
# Force redeploy
npm run deploy -- --force
# Check translation files
cat public/locales/en/translation.json
cat public/locales/es/translation.json
# Validate mqa-config.json syntax
npm run build 2>&1 | grep -i "config\|json"
# Check for missing metric labels
grep -r "metrics.specific" public/locales/
Backend Server Configuration Errors:
# Check current backend_server configuration
grep -A 10 '"backend_server"' src/config/mqa-config.json
# Common fixes:
# For GitHub Pages: Set "enabled": false
# For Docker: Set "enabled": true, "url": ""
# For local dev: Set "enabled": true, "url": "http://localhost:3001/api"
Data Quality Configuration Errors:
# Check current data_quality configuration
grep -A 5 '"data_quality"' src/config/mqa-config.json
# GitHub Pages: Must be "enabled": false (no backend support)
# Docker/Local: Can be "enabled": true (backend available)
Configuration by Deployment Type:
| Deployment | backend_server.enabled | data_quality.enabled | Reason |
|---|---|---|---|
| GitHub Pages | false | false | No backend services available |
| Docker | true | true | Full backend support with Express API |
| Local Dev | true | true | Backend runs on localhost:3001 |
| Static Hosting | false | false | Similar to GitHub Pages |
# Port already in use error
./dev-cleanup.sh # Clean up occupied ports
./dev-start.sh # Restart development servers
# Backend connection issues
curl http://localhost:3001/api/health # Check backend health
cat .env.local # Verify configuration
# Frontend can't reach backend
# Check that REACT_APP_BACKEND_URL matches actual backend port
grep REACT_APP_BACKEND_URL .env.local
# Analyze bundle size
npm run build && npx webpack-bundle-analyzer build/static/js/*.js
# Check for memory leaks in development
npm start -- --profile
- New Profile Support:
  - Update mqa-config.json with profile definition
  - Add SHACL validation files
  - Create profile icon in public/img/icons/
  - Add translations for profile name and metrics
- New Quality Metrics:
  - Define metric in profile_metrics section
  - Implement evaluation logic in MQAService.ts
  - Add metric labels to translation files
  - Update documentation
- SPARQL Queries:
  - Add queries to sparql_config.queries
  - Test with debug queries first
  - Document parameter usage
  - Provide sample data URLs
- TypeScript: Strict mode enabled, no any types
- React: Functional components with hooks
- CSS: Bootstrap 5 + custom CSS variables
- i18n: All user-facing text must be translatable
- Testing: Add tests for new services and components
This project is licensed under the MIT License. See the LICENSE file for details.
Built with ❤️ for the Open Data community