45 changes: 45 additions & 0 deletions docs/reference/feature-servers/python-feature-server.md
@@ -8,6 +8,51 @@ The Python feature server is an HTTP endpoint that serves features with JSON I/O

There is a CLI command that starts the server: `feast serve`. By default, Feast uses port 6566; the port can be overridden with the `--port` flag.

### Performance Configuration

For production deployments, the feature server supports several performance optimization options:

```bash
# Basic usage
feast serve

# Production configuration with multiple workers
feast serve --workers -1 --worker-connections 1000 --registry_ttl_sec 60

# Manual worker configuration
feast serve --workers 8 --worker-connections 2000 --max-requests 1000
```

Key performance options (a configuration sketch follows this list):
- `--workers, -w`: Number of worker processes. Use `-1` to auto-calculate based on CPU cores (recommended for production)
- `--worker-connections`: Maximum simultaneous clients per worker process (default: 1000)
- `--max-requests`: Maximum number of requests a worker handles before it is restarted, which helps prevent memory leaks (default: 1000)
- `--max-requests-jitter`: Jitter to prevent thundering herd on worker restart (default: 50)
- `--registry_ttl_sec, -r`: Registry refresh interval in seconds. Higher values reduce overhead but increase staleness (default: 60)
- `--keep-alive-timeout`: Keep-alive connection timeout in seconds (default: 30)
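
These flags are collected into a single server options dictionary (see the `start_server` changes in this PR). A minimal sketch of the resulting gunicorn-style mapping, with illustrative values rather than a runnable server:

```python
# Sketch: how the CLI flags map onto the options dictionary built in
# feature_server.start_server (values here are illustrative only).
options = {
    "bind": "0.0.0.0:6566",        # host:port the server binds to (--port)
    "workers": 8,                  # --workers
    "worker_connections": 2000,    # --worker-connections
    "max_requests": 1000,          # --max-requests
    "max_requests_jitter": 50,     # --max-requests-jitter
    "keepalive": 30,               # --keep-alive-timeout
    "registry_ttl_sec": 60,        # --registry_ttl_sec / -r
}
```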

### Performance Best Practices

**Worker Configuration:**
- For production: Use `--workers -1` to auto-calculate the optimal worker count (2 × CPU cores + 1; see the sketch after this list)
- For development: Use default single worker (`--workers 1`)
- Monitor CPU and memory usage to tune worker count manually if needed
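
With `--workers -1`, the worker count is derived from the machine's CPU count. A minimal sketch of the formula, matching the auto-calculation added to `serve.py` in this change:

```python
import multiprocessing

def auto_worker_count() -> int:
    # 2 x CPU cores + 1, with a floor of one worker
    return max(1, (multiprocessing.cpu_count() * 2) + 1)
```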

**Registry TTL:**
- Production: Use `--registry_ttl_sec 60` or higher to reduce refresh overhead
- Development: Use lower values (5-10s) for faster iteration when schemas change frequently
- Balance between performance (higher TTL) and freshness (lower TTL)

**Connection Tuning:**
- Increase `--worker-connections` for high-concurrency workloads (see the programmatic sketch after this list)
- Use `--max-requests` to prevent memory leaks in long-running deployments
- Adjust `--keep-alive-timeout` based on client connection patterns
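
The same knobs are exposed when the server is started programmatically. A minimal sketch against the `FeatureStore.serve` signature extended in this PR; the repo path, host, and port here are assumptions for illustration:

```python
from feast import FeatureStore

store = FeatureStore(repo_path=".")  # hypothetical repo path

# Mirrors the CLI flags above; connection-related defaults come from the
# serve() signature in this change.
store.serve(
    host="0.0.0.0",
    port=6566,
    workers=8,
    worker_connections=2000,   # raise for high-concurrency workloads
    max_requests=1000,         # recycle workers to contain memory growth
    max_requests_jitter=50,    # stagger restarts to avoid a thundering herd
    keep_alive_timeout=30,
    registry_ttl_sec=60,
)
```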

**Container Deployments:**
- Set appropriate CPU/memory limits in Kubernetes to match worker configuration
- Use HTTP health checks instead of TCP for better application-level monitoring
- Consider horizontal pod autoscaling based on request latency metrics

## Deploying as a service

See [this](../../how-to-guides/running-feast-in-production.md#id-4.2.-deploy-feast-feature-servers-on-kubernetes) for an example of how to run Feast on Kubernetes using the Operator.
45 changes: 38 additions & 7 deletions sdk/python/feast/cli/serve.py
@@ -52,21 +52,42 @@
type=click.INT,
default=1,
show_default=True,
help="Number of worker",
help="Number of worker processes. Use -1 to auto-calculate based on CPU cores",
)
@click.option(
"--worker-connections",
type=click.INT,
default=1000,
show_default=True,
help="Maximum number of simultaneous clients per worker process",
)
@click.option(
"--max-requests",
type=click.INT,
default=1000,
show_default=True,
help="Maximum number of requests a worker will process before restarting (prevents memory leaks)",
)
@click.option(
"--max-requests-jitter",
type=click.INT,
default=50,
show_default=True,
help="Maximum jitter to add to max-requests to prevent thundering herd on worker restart",
)
@click.option(
"--keep-alive-timeout",
type=click.INT,
default=5,
default=30,
show_default=True,
help="Timeout for keep alive",
help="Timeout for keep alive connections (seconds)",
)
@click.option(
"--registry_ttl_sec",
"-r",
help="Number of seconds after which the registry is refreshed",
help="Number of seconds after which the registry is refreshed. Higher values reduce refresh overhead but increase staleness",
type=click.INT,
default=5,
default=60,
show_default=True,
)
@click.option(
@@ -102,11 +123,14 @@ def serve_command(
type_: str,
no_access_log: bool,
workers: int,
metrics: bool,
worker_connections: int,
max_requests: int,
max_requests_jitter: int,
keep_alive_timeout: int,
registry_ttl_sec: int,
tls_key_path: str,
tls_cert_path: str,
registry_ttl_sec: int = 5,
metrics: bool,
):
"""Start a feature server locally on a given port."""
if (tls_key_path and not tls_cert_path) or (not tls_key_path and tls_cert_path):
@@ -115,12 +139,19 @@
)
store = create_feature_store(ctx)

# Auto-calculate workers if -1 is specified
if workers == -1:
workers = max(1, (multiprocessing.cpu_count() * 2) + 1)

store.serve(
host=host,
port=port,
type_=type_,
no_access_log=no_access_log,
workers=workers,
worker_connections=worker_connections,
max_requests=max_requests,
max_requests_jitter=max_requests_jitter,
metrics=metrics,
keep_alive_timeout=keep_alive_timeout,
tls_key_path=tls_key_path,
6 changes: 6 additions & 0 deletions sdk/python/feast/feature_server.py
@@ -796,6 +796,9 @@ def start_server(
port: int,
no_access_log: bool,
workers: int,
worker_connections: int,
max_requests: int,
max_requests_jitter: int,
keep_alive_timeout: int,
registry_ttl_sec: int,
tls_key_path: str,
@@ -833,6 +836,9 @@
"bind": f"{host}:{port}",
"accesslog": None if no_access_log else "-",
"workers": workers,
"worker_connections": worker_connections,
"max_requests": max_requests,
"max_requests_jitter": max_requests_jitter,
"keepalive": keep_alive_timeout,
"registry_ttl_sec": registry_ttl_sec,
}
8 changes: 7 additions & 1 deletion sdk/python/feast/feature_store.py
@@ -2639,11 +2639,14 @@ def serve(
type_: str = "http",
no_access_log: bool = True,
workers: int = 1,
worker_connections: int = 1000,
max_requests: int = 1000,
max_requests_jitter: int = 50,
metrics: bool = False,
keep_alive_timeout: int = 30,
tls_key_path: str = "",
tls_cert_path: str = "",
registry_ttl_sec: int = 2,
registry_ttl_sec: int = 60,
) -> None:
"""Start the feature consumption server locally on a given port."""
type_ = type_.lower()
@@ -2658,6 +2661,9 @@
port=port,
no_access_log=no_access_log,
workers=workers,
worker_connections=worker_connections,
max_requests=max_requests,
max_requests_jitter=max_requests_jitter,
metrics=metrics,
keep_alive_timeout=keep_alive_timeout,
tls_key_path=tls_key_path,