Analytics API - Turnkey ML Platform

Upload anything → Get anomaly alerts & forecasts

Production-ready REST API with:

📝 Log Anomaly Detection (TF-IDF + DBSCAN for text)
🔍 Outlier Detection (IQR, Z-score, Isolation Forest)
📈 Time-Series Forecasting (Linear, Exponential, Prophet)
🧠 Unified Model (One model, all data, continuous learning)
📊 Smart Upload (Auto-detects CSV, Excel, JSON, Logs)

55,000+ rows tested | 6 datasets | 1 unified model

Quick Start

# Install & generate data
make install
make data

# Analyze a dataset (NO CURL NEEDED!)
python app.py analyze data/sensor_data.csv temperature

# Train unified model
python app.py train data/sales_data.csv sales

# Check status
python app.py status

# Start server
python app.py

# Or use Makefile shortcuts
make analyze-sensors  # Analyze 20K sensor readings
make train            # Train on all 55K samples
make help             # See all commands

Features

✨ Lightweight algorithms:

Outlier Detection: IQR, Z-score, Isolation Forest
Linear Regression: Ridge regression with automatic feature scaling
Time-Series Forecasting: Exponential smoothing & linear trend
Model Training & Persistence: Train once, predict many times
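
To make "Ridge regression with automatic feature scaling" concrete, here is a minimal single-feature sketch using only the standard library. It is an illustration of the technique, not the API's actual implementation; the function names `fit_ridge_1d` and `predict_ridge_1d`, the `alpha` default, and the sample numbers are all assumptions made for this example.

```python
from statistics import fmean, pstdev


def fit_ridge_1d(xs, ys, alpha=1.0):
    """Fit y ~ intercept + coef * standardized(x) with an L2 penalty on coef.

    Hypothetical helper: the feature is standardized first, which is what
    "automatic feature scaling" is assumed to mean here.
    """
    mean_x, std_x = fmean(xs), pstdev(xs)
    if std_x == 0:
        raise ValueError("feature is constant; it cannot be scaled")
    mean_y = fmean(ys)
    zs = [(x - mean_x) / std_x for x in xs]
    # Closed-form ridge solution for one centered, standardized feature.
    numerator = sum(z * (y - mean_y) for z, y in zip(zs, ys))
    denominator = sum(z * z for z in zs) + alpha
    return {
        "mean_x": mean_x,
        "std_x": std_x,
        "intercept": mean_y,  # the intercept is not penalized
        "coef": numerator / denominator,
    }


def predict_ridge_1d(model, xs):
    """Apply the stored scaling, then the linear model."""
    return [
        model["intercept"] + model["coef"] * (x - model["mean_x"]) / model["std_x"]
        for x in xs
    ]


# Illustrative data only: sales = 4 * marketing + 20.
marketing = [20.0, 25.0, 30.0, 35.0]
sales = [100.0, 120.0, 140.0, 160.0]

ols_model = fit_ridge_1d(marketing, sales, alpha=0.0)    # no penalty
ridge_model = fit_ridge_1d(marketing, sales, alpha=1.0)  # shrunk coefficient
prediction = predict_ridge_1d(ols_model, [40.0])[0]
print(prediction, ridge_model["coef"] < ols_model["coef"])
```

Storing the scaling parameters alongside the coefficient is what lets a model be trained once and reused for later predictions.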

🚀 Production-ready:

Thread-safe data & model storage
Rate limiting & security headers
Prometheus metrics
Health checks & monitoring
Docker support

Manual Setup

1. Install Dependencies

pip install -r requirements.txt

2. Run the Server

python app.py

Server starts at http://localhost:5000

3. Test the API

python test.py

Sample Data

Use the ready-to-go datasets in sample_data/ to exercise every endpoint without having to craft payloads from scratch:

| File | Format | Purpose |
| --- | --- | --- |
| `sample_data/sales_data.csv` | CSV | Multivariate retail metrics with a deliberate sales spike to test regression & outlier detection |
| `sample_data/sales_data.xlsx` | Excel | Same data for validating Excel ingestion |
| `sample_data/sales_data.json` | JSON | Direct upload example for the REST API |
| `sample_data/energy_usage.csv` | CSV | Clean time-series readings for forecasting |
| `sample_data/energy_usage.json` | JSON | Alternative time-series payload |
| `sample_data/system_logs.txt` | Text | Small log file you can vectorize or preprocess before uploading |

Quick functional walkthrough

Upload the CSV/JSON data:

python - <<'PY'
import pandas as pd, requests
df = pd.read_csv('sample_data/sales_data.csv')
payload = {"name": "sales_sample", "data": df.to_dict('records')}
res = requests.post('http://localhost:5000/upload_data', json=payload, timeout=10)
res.raise_for_status()
print(res.json())
PY
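
If pandas is not installed, the same `{"name": ..., "data": [...]}` payload shape can be built with the standard library. The sketch below is one possible approach, not part of the project; `csv_to_payload` is a hypothetical helper and the inline CSV text is made-up sample data.

```python
import csv
import io
import json


def csv_to_payload(name, csv_text):
    """Build an /upload_data style payload from CSV text.

    Hypothetical helper: cells that parse as numbers become floats,
    everything else (dates, labels, empty cells) stays a string.
    """
    def coerce(value):
        try:
            return float(value)
        except (ValueError, TypeError):
            return value

    reader = csv.DictReader(io.StringIO(csv_text))
    rows = [{key: coerce(value) for key, value in row.items()} for row in reader]
    return {"name": name, "data": rows}


# Made-up rows for illustration; a real file could be read with open(...).read().
sample = "date,sales,marketing_spend\n2024-01-01,100,20\n2024-01-02,120,25\n"
payload = csv_to_payload("sales_sample", sample)
print(json.dumps(payload, indent=2))
```

The resulting dictionary can be passed as the `json=` argument of `requests.post`, exactly like the pandas-based payload above.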

Detect outliers:

curl -s http://localhost:5000/detect_outliers \
  -H 'Content-Type: application/json' \
  -d '{"dataset": "sales_sample", "method": "iforest", "columns": ["sales", "marketing_spend"]}' | jq .statistics

Run regression:

curl -s http://localhost:5000/regression \
  -H 'Content-Type: application/json' \
  -d '{"dataset": "sales_sample", "target": "sales", "features": ["marketing_spend", "temperature"], "model_name": "sales_model"}' | jq .metrics

Forecast with the energy dataset:

python - <<'PY'
import pandas as pd, requests
df = pd.read_csv('sample_data/energy_usage.csv')
payload = {"name": "energy_sample", "data": df.to_dict('records')}
requests.post('http://localhost:5000/upload_data', json=payload, timeout=10).raise_for_status()
res = requests.post('http://localhost:5000/forecast', json={
    "dataset": "energy_sample",
    "value_column": "energy_usage",
    "method": "linear",
    "horizon": 5
}, timeout=10)
res.raise_for_status()
print(res.json()["forecast"])
PY

Run the automated suite once the data is loaded:
```
pytest tests/test_analytics.py -v
```

Core Endpoints

Upload Data

POST /upload_data
{"name": "sales", "data": [{"col1": 10, "col2": 20}]}

Detect Outliers

POST /detect_outliers
{"dataset": "sales", "method": "iqr"}
# methods: iqr, zscore, iforest

Train Model

POST /train_model
{"dataset": "sales", "target": "col1", "model_name": "predictor"}

Predict

POST /predict
{"model_name": "predictor", "data": [{"col2": 25}]}

Forecast

POST /forecast
{"dataset": "sales", "value_column": "col1", "method": "linear", "horizon": 10}
# methods: linear, exponential
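
As a rough illustration of what the `linear` and `exponential` methods might compute, the sketch below implements a least-squares linear trend and simple exponential smoothing in plain Python. The function names, the `alpha` smoothing parameter, and the input series are assumptions for this example; the server's own implementation may differ.

```python
from statistics import fmean


def linear_trend_forecast(values, horizon):
    """Fit a least-squares line over time steps 0..n-1 and extrapolate it."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two points to fit a trend")
    t_mean = (n - 1) / 2
    y_mean = fmean(values)
    slope = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(values)) / sum(
        (t - t_mean) ** 2 for t in range(n)
    )
    intercept = y_mean - slope * t_mean
    # Forecast the next `horizon` steps after the last observed index.
    return [intercept + slope * (n + h) for h in range(horizon)]


def exponential_smoothing_forecast(values, horizon, alpha=0.5):
    """Simple exponential smoothing: the forecast is flat at the final level."""
    if not values:
        raise ValueError("need at least one point")
    level = values[0]
    for y in values[1:]:
        level = alpha * y + (1 - alpha) * level
    return [level] * horizon


series = [10.0, 12.0, 14.0, 16.0]  # illustrative values only
linear = linear_trend_forecast(series, horizon=3)
smoothed = exponential_smoothing_forecast(series, horizon=2, alpha=0.5)
print(linear, smoothed)
```

Note the difference in behavior: the linear method continues the upward trend, while simple exponential smoothing produces a flat forecast, which is why it suits series without a strong trend.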

Regression

POST /regression
{"dataset": "sales", "target": "col1", "features": ["col2"]}

Example Usage

import requests

# Upload data (raise_for_status surfaces HTTP errors early)
requests.post('http://localhost:5000/upload_data', json={
    "name": "sales",
    "data": [
        {"date": "2024-01-01", "sales": 100, "marketing": 20},
        {"date": "2024-01-02", "sales": 120, "marketing": 25}
    ]
}, timeout=10).raise_for_status()

# Detect outliers
result = requests.post('http://localhost:5000/detect_outliers', json={
    "dataset": "sales",
    "method": "iforest"
}, timeout=10)
result.raise_for_status()
print(f"Found {result.json()['n_outliers']} outliers")

# Train model
requests.post('http://localhost:5000/train_model', json={
    "dataset": "sales",
    "target": "sales",
    "model_name": "predictor",
    "features": ["marketing"]
}, timeout=10).raise_for_status()

# Predict
predictions = requests.post('http://localhost:5000/predict', json={
    "model_name": "predictor",
    "data": [{"marketing": 30}]
}, timeout=10)
predictions.raise_for_status()
print(predictions.json()['predictions'])

Configuration

Create a .env file (see .env.example):

SECRET_KEY=your-secret-key
RATE_LIMIT_PER_MINUTE=60
MAX_BATCH_SIZE=10000
MODEL_DIR=models
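
For orientation, this is one way such settings could be read in Python once the variables are present in the process environment. It is a sketch only: the project's `config.py` may work differently, `load_settings` is a hypothetical name, and the fallback defaults simply mirror the example values above.

```python
import os


def load_settings(env=None):
    """Read the four documented variables, with assumed fallback defaults."""
    env = os.environ if env is None else env
    return {
        "secret_key": env.get("SECRET_KEY", ""),
        # Numeric settings arrive as strings and must be converted.
        "rate_limit_per_minute": int(env.get("RATE_LIMIT_PER_MINUTE", "60")),
        "max_batch_size": int(env.get("MAX_BATCH_SIZE", "10000")),
        "model_dir": env.get("MODEL_DIR", "models"),
    }


# Passing a plain dict makes the function easy to try without touching os.environ.
settings = load_settings({"RATE_LIMIT_PER_MINUTE": "120"})
print(settings)
```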

Docker

docker-compose up -d

Algorithm Details

IQR: Fast, no training. Flags values below Q1 - 1.5×IQR or above Q3 + 1.5×IQR
Z-Score: Assumes normal distribution. Flags |z| > 3
Isolation Forest: Best for high-dimensional, non-linear patterns
Ridge Regression: L2 regularization, automatic scaling
Exponential Smoothing: Fast, good for stationary data
Linear Trend: Simple, interpretable forecasting
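
The IQR and Z-score rules above can be sketched in a few lines of standard-library Python. This is an illustration under stated assumptions, not the API's code: the function names, the choice of the "inclusive" quartile method, the use of the population standard deviation, and the sample readings are all choices made for this example.

```python
from statistics import fmean, pstdev, quantiles


def iqr_outliers(values, k=1.5):
    """Return values below Q1 - k*IQR or above Q3 + k*IQR."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


def zscore_outliers(values, threshold=3.0):
    """Return values whose absolute z-score exceeds the threshold."""
    mean, std = fmean(values), pstdev(values)
    if std == 0:
        return []  # all values identical, nothing can be an outlier
    return [v for v in values if abs(v - mean) / std > threshold]


# Twenty ordinary readings plus one spike (made-up data).
readings = [10, 11, 12, 11] * 5 + [95]
iqr_flags = iqr_outliers(readings)
z_flags = zscore_outliers(readings)
print(iqr_flags, z_flags)
```

One practical caveat for the Z-score rule: with the population standard deviation, a single extreme point among n values can reach at most |z| = sqrt(n - 1), so a threshold of 3 can never trigger on fewer than 11 points. The IQR rule does not have this limitation.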

Performance

100K rows in memory
Sub-second responses
Isolation Forest: ~100ms for 10K rows
Regression: ~50ms for 10K rows

License

MIT
