rahim8050/String_Server

🚀 String Server

A high-performance concurrent TCP string lookup server that performs exact full-line, case-sensitive matches against a configured text file.

  • ✅ Assignment-grade: meets the specification (protocol, config, REREAD_ON_QUERY, TLS toggle, tests, benchmarks)
  • 🛠️ Production-grade: strict linting/typing/security checks, configurable, daemon-ready, and testable in a clean Linux environment

✨ Features

  • 🔎 Exact full-line match
    • Case-sensitive
    • No partial matches (substring hits are rejected)
  • 📂 Config-driven data path
    • Supports the spec-required linuxpath=... line
    • Also supports structured keys like FILE_PATH, DATA_PATH, LINUX_PATH
  • 🔁 REREAD_ON_QUERY mode
    • False: load file once at startup for minimum latency
    • True: re-open file on every query to reflect changes
  • 🧵 Concurrent handling
    • Multi-threaded server to handle many clients in parallel
  • 🔐 Optional TLS
    • Certificate-based TLS via Python ssl
    • Enabled/disabled via configuration (SSL_ENABLED)
  • 🧾 Deterministic debug logging
    • One DEBUG: line per request with timestamp, IP, query and execution time
  • 📊 Benchmark tooling
    • Dedicated benchmark client and harness
    • Used to generate a speed report (multiple algorithms, both modes)

🧠 Protocol & Semantics

Request

  • Client connects via TCP.
  • Server reads up to 1024 bytes from the connection.
  • Processing rules:
    • Strip any trailing \x00 bytes.
    • Use only the first line of the payload as the query string.
    • Perform an exact full-line, case-sensitive match against the configured data file. Partial line matches are not counted as hits.
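
The processing rules above can be sketched in a few lines of Python. This is a minimal illustration, not the server's actual code: the function name extract_query is hypothetical, and both the UTF-8 decoding and the removal of a trailing carriage return are assumptions that the rules do not state.

```python
def extract_query(payload: bytes, max_bytes: int = 1024) -> str:
    """Turn a raw TCP payload into the query string (hypothetical helper)."""
    # Only the first max_bytes bytes are considered, mirroring the 1024-byte read.
    data = payload[:max_bytes]
    # Strip trailing NUL padding.
    data = data.rstrip(b"\x00")
    # Keep only the first line; anything after the first newline is ignored.
    first_line = data.split(b"\n", 1)[0]
    # Assumption: drop a trailing "\r" from CRLF clients and decode as UTF-8.
    return first_line.rstrip(b"\r").decode("utf-8", errors="replace")
```

For example, a payload of b"test\n" followed by NUL padding yields the query "test", and a payload with several lines yields only the first one.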

Response

For each query, the server writes:

  1. A single DEBUG: line containing:
    • Timestamp in UTC
    • Client IP address
    • Query string
    • Measured execution time in milliseconds
  2. A result line:
    • STRING EXISTS or
    • STRING NOT FOUND
  3. Each line is terminated with a newline (\n).

Example:

printf "test\n" | nc 127.0.0.1 9000

Output:

DEBUG: ts=2025-12-09T10:55:08.093Z ip=127.0.0.1 query="test" exec_ms=0.020
STRING EXISTS
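
A response in this shape could be assembled as follows. The sketch is inferred from the sample output only: the function name format_response is hypothetical, and the millisecond timestamp precision and three-decimal exec_ms formatting are assumptions read off the example, not confirmed details of the server.

```python
from datetime import datetime, timezone
from typing import Optional


def format_response(ip: str, query: str, found: bool, exec_ms: float,
                    now: Optional[datetime] = None) -> str:
    """Build the two-line response (hypothetical helper)."""
    if now is None:
        now = datetime.now(timezone.utc)
    # UTC timestamp with millisecond precision and a trailing "Z".
    stamp = now.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%S.%f")
    ts = stamp[:-3] + "Z"
    debug = f'DEBUG: ts={ts} ip={ip} query="{query}" exec_ms={exec_ms:.3f}'
    result = "STRING EXISTS" if found else "STRING NOT FOUND"
    # Each line is terminated with a newline.
    return f"{debug}\n{result}\n"
```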

⚡ Quick Start

1) Create virtualenv & install dependencies

python -m venv .venv
source .venv/bin/activate

pip install -r requirements.txt
pip install -r requirements-dev.txt

2) Point the server at a config file

The server reads the path to its config file from the STRING_SERVER_CONFIG environment variable:

export STRING_SERVER_CONFIG=./config.ini

You can verify:

echo "$STRING_SERVER_CONFIG"

3) Run the server (foreground)

python -m string_server.server

4) Smoke test over TCP

printf "test\n" | nc 127.0.0.1 9000

You should see a DEBUG: line followed by either STRING EXISTS or STRING NOT FOUND.
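
The same smoke test can be run from Python when nc is not available. This is a sketch that is independent of the repository's client.py: the function name send_query is hypothetical, and it assumes the response consists of exactly the two lines described in the protocol section.

```python
import socket


def send_query(query: str, host: str = "127.0.0.1", port: int = 9000,
               timeout: float = 5.0) -> list[str]:
    """Send one query and return the response lines (hypothetical helper)."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(query.encode("utf-8") + b"\n")
        data = b""
        # Stop after two newline-terminated lines or when the peer closes.
        while data.count(b"\n") < 2:
            chunk = sock.recv(4096)
            if not chunk:
                break
            data += chunk
    return data.decode("utf-8").splitlines()
```

Calling send_query("test") against a running server should return a list whose last element is either STRING EXISTS or STRING NOT FOUND.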


🧩 Configuration

The server loads its configuration from an INI file through the Config class. Keys the server does not recognize are tolerated, so unrelated settings can share the same file without causing a failure.

✅ Example config.ini

[DEFAULT]
HOST=127.0.0.1
PORT=9000

# Required by specification:
# This line is used to locate the data file and may coexist with other keys.
linuxpath=tests/data/sample_10k.txt

# Internal file path keys (server supports any of these):
# FILE_PATH, DATA_PATH, LINUX_PATH, WINDOWS_PATH
LINUX_PATH=tests/data/sample_10k.txt

# Performance mode:
# false: preload once (fastest, file considered stable)
# true: re-open and re-parse per query (file may change rapidly)
REREAD_ON_QUERY=false

# Optional TLS
SSL_ENABLED=false
CERT_PATH=./utils/cert.pem
KEY_PATH=./utils/key.pem

[BENCHMARK]
# Default benchmark query and parameters:
QUERY=test
ITERATIONS=5000
WARMUP=500
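
To show how a file like this can be consumed, here is a sketch built on the standard configparser module. It is not the project's Config class: the function load_settings, the returned dictionary, and the lookup order in PATH_KEYS are all assumptions made for illustration.

```python
import configparser

# Assumed lookup order for the data path; the real Config class may differ.
PATH_KEYS = ("linuxpath", "file_path", "data_path", "linux_path", "windows_path")


def load_settings(ini_text: str) -> dict:
    """Parse INI text into a plain dict of server settings (hypothetical)."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    # configparser lower-cases option names, so HOST is read back as "host".
    section = parser["DEFAULT"]
    data_path = next((section[k] for k in PATH_KEYS if k in section), None)
    if data_path is None:
        raise ValueError("no data file path configured")
    # Unknown keys are simply never looked up, so they cannot cause a failure.
    return {
        "host": section.get("host", fallback="127.0.0.1"),
        "port": section.getint("port", fallback=9000),
        "data_path": data_path,
        "reread_on_query": section.getboolean("reread_on_query", fallback=False),
        "ssl_enabled": section.getboolean("ssl_enabled", fallback=False),
    }
```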

📄 Data file rules

  • One entry per line in the data file.
  • Matching is:
    • exact (no substring matches),
    • full-line (the query must match a complete line),
    • case-sensitive.
  • For portability across tools and platforms, ensure there is a newline at the end of the file.
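
These matching rules map naturally onto set membership over whole lines. The sketch below illustrates the semantics with hypothetical helper names (build_index, lookup); it assumes "\n" line endings and is not necessarily the algorithm the server uses.

```python
def build_index(file_text: str) -> frozenset:
    """Split file contents into whole-line entries (hypothetical helper)."""
    lines = file_text.split("\n")
    # A trailing newline must not create an extra empty entry.
    if lines and lines[-1] == "":
        lines.pop()
    return frozenset(lines)


def lookup(index: frozenset, query: str) -> str:
    # Set membership compares complete strings, so the match is exact,
    # full-line, and case-sensitive by construction.
    return "STRING EXISTS" if query in index else "STRING NOT FOUND"
```

With a file containing the lines apple, Apple pie, and banana, the query apple is found, while app (a substring) and Apple (a different case, and only part of a line) are not.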

🔐 TLS / SSL Configuration (Optional)

The server supports certificate-based TLS, controlled entirely via config.

1) Generate a self-signed certificate (development/testing)

openssl req -x509 -newkey rsa:2048 \
  -keyout utils/key.pem -out utils/cert.pem \
  -nodes -days 365

2) Enable TLS in config.ini

SSL_ENABLED=true
CERT_PATH=./utils/cert.pem
KEY_PATH=./utils/key.pem

3) Test a TLS connection

printf "test\n" | openssl s_client -connect 127.0.0.1:9000 -quiet

You should see the same DEBUG: + result response, but now over TLS.
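
A Python client can perform the same check with the standard ssl module. The sketch below is for development use against a self-signed certificate only, because it turns certificate verification off; the function names are hypothetical, and it assumes the two-line response described in the protocol section.

```python
import socket
import ssl


def make_dev_client_context() -> ssl.SSLContext:
    """Client context that accepts a self-signed certificate (dev only)."""
    ctx = ssl.create_default_context()
    # check_hostname must be disabled before verify_mode can be CERT_NONE.
    ctx.check_hostname = False
    ctx.verify_mode = ssl.CERT_NONE
    return ctx


def send_query_tls(query: str, host: str = "127.0.0.1", port: int = 9000,
                   timeout: float = 5.0) -> str:
    """Send one query over TLS and return the raw response text."""
    ctx = make_dev_client_context()
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with ctx.wrap_socket(raw, server_hostname=host) as tls:
            tls.sendall(query.encode("utf-8") + b"\n")
            data = b""
            # Stop after two newline-terminated lines or when the peer closes.
            while data.count(b"\n") < 2:
                chunk = tls.recv(4096)
                if not chunk:
                    break
                data += chunk
    return data.decode("utf-8")
```

Outside of local testing, a safer choice is to keep verification enabled and load the server certificate as a trusted CA with load_verify_locations.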

Note: PSK TLS is not implemented. The server uses certificate-based authentication as allowed by the specification.


🔁 REREAD_ON_QUERY Semantics

The REREAD_ON_QUERY option controls how aggressively the server tracks changes to the data file:

  • REREAD_ON_QUERY=false
    • Data file is read once at server startup.
    • Lookups are performed against an in-memory structure.
    • Target average latency: ~0.5 ms per query for large files (10k–250k lines), depending on hardware.
  • REREAD_ON_QUERY=true
    • The data file is re-opened and re-read on every query.
    • The server always sees the latest contents of the file.
    • Target average latency: ≤ 40 ms per query for large files, depending on hardware.

The benchmark tooling (see below) is used to verify that the chosen search algorithm(s) meet these thresholds.
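
The difference between the two modes can be illustrated with a small class. LineLookup is a hypothetical name, and the set-based lookup is an assumption made for the sketch rather than the server's confirmed data structure.

```python
from pathlib import Path


class LineLookup:
    """Hypothetical helper illustrating the two REREAD_ON_QUERY modes."""

    def __init__(self, path: str, reread_on_query: bool) -> None:
        self._path = Path(path)
        self._reread = reread_on_query
        # Cached mode: the file is read exactly once, at construction time.
        self._cache = None if reread_on_query else self._load()

    def _load(self) -> set:
        # newline="\n" keeps line endings untranslated while reading.
        with self._path.open("r", encoding="utf-8", newline="\n") as fh:
            return {line.rstrip("\n") for line in fh}

    def exists(self, query: str) -> bool:
        # Reread mode: pay the file I/O cost on every query to see changes.
        lines = self._load() if self._reread else self._cache
        return query in lines
```

An instance created with reread_on_query=False keeps answering from its startup snapshot after the file changes, while one created with reread_on_query=True picks up new lines on the next query.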


📊 Benchmarking & Speed Report

The repository includes a benchmark harness and client designed to:

  • Compare at least 5 different search strategies.
  • Measure performance for:
    • REREAD_ON_QUERY=true
    • REREAD_ON_QUERY=false
  • Generate measurements for different file sizes (from 10,000 to 1,000,000 lines).
  • Provide data for:
    • Average latency
    • p50 / p95 / p99 percentiles
    • Throughput under increased concurrency
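
As an illustration of how the latency figures above can be derived from raw samples, here is a sketch using the nearest-rank percentile definition. The helper names are hypothetical, and benchmark_speed.py may compute its statistics differently.

```python
import math


def percentile(sorted_samples: list, p: int) -> float:
    """Nearest-rank percentile of an already sorted, non-empty list."""
    rank = max(1, math.ceil(p * len(sorted_samples) / 100))
    return sorted_samples[rank - 1]


def summarize(latencies_ms: list) -> dict:
    """Average and p50/p95/p99 for a list of per-query latencies in ms."""
    ordered = sorted(latencies_ms)
    return {
        "avg": sum(ordered) / len(ordered),
        "p50": percentile(ordered, 50),
        "p95": percentile(ordered, 95),
        "p99": percentile(ordered, 99),
    }
```

Tail percentiles such as p99 are worth reporting next to the average because a handful of slow queries can hide behind a low mean.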

1) Running the benchmark harness

python benchmark_speed.py

This script:

  • Loads configuration from config.ini (including [BENCHMARK] section).
  • Invokes client.py-style logic to send many queries.
  • Prints summary statistics for each algorithm and mode.

2) Speed report (PDF)

Benchmark results are summarized in a separate PDF:

  • Path (example): docs/speed_report.pdf
  • Contains:
    • A table comparing at least 5 algorithms (sorted by performance).
    • At least one chart showing performance vs. file size.
    • Results for both REREAD_ON_QUERY=true and false.
    • Justification for the algorithm chosen as the default implementation in the server.

The server code in search_engine.py uses the fastest algorithm determined by these benchmarks for each mode.
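
To make the idea of comparing search strategies concrete, the sketch below times three common approaches on in-memory data. The strategy names and helper functions are hypothetical and are not taken from search_engine.py; the actual report compares at least five algorithms.

```python
import bisect
import time


def build_strategies(lines: list) -> dict:
    """Return name -> lookup callable for three illustrative strategies."""
    as_set = set(lines)
    ordered = sorted(lines)

    def bisect_lookup(query: str) -> bool:
        # Binary search on a sorted copy, then confirm an exact match.
        i = bisect.bisect_left(ordered, query)
        return i < len(ordered) and ordered[i] == query

    return {
        "linear_scan": lambda q: q in lines,  # list membership scans linearly
        "hash_set": lambda q: q in as_set,
        "sorted_bisect": bisect_lookup,
    }


def time_lookup(lookup, query: str, iterations: int = 1000) -> float:
    """Average milliseconds per call over the given number of iterations."""
    start = time.perf_counter()
    for _ in range(iterations):
        lookup(query)
    return (time.perf_counter() - start) * 1000 / iterations
```

Every strategy must return the same answer for the same query; only the timings should differ, which is what makes a side-by-side comparison meaningful.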


🧪 Testing & Quality Gates

The server is designed to be testable in a clean environment with no hardcoded paths. All test data lives under tests/data/.

1) Run unit tests with coverage

coverage erase
coverage run --source=string_server -m pytest -q
coverage report -m --fail-under=100

2) Verify tests in a minimal environment

To confirm that the test suite does not depend on your shell environment:

env -i PATH=/usr/bin:/bin PYTHONHASHSEED=0 LC_ALL=C TZ=UTC \
  ./.venv/bin/python -m pytest -q

3) Linting, typing, security

ruff format .
ruff check .

mypy .

bandit -r .

flake8 .

  • flake8 is configured with max-line-length = 79 and the usual modern ignores (E203, W503).
  • mypy enforces static typing.
  • bandit provides a baseline security scan.
  • ruff handles linting + formatting for Python.

🧯 Troubleshooting

🧷 Missing config environment variable

If the server reports a missing configuration error, set the variable that points to the config file:

export STRING_SERVER_CONFIG=./config.ini

Verify:

echo "$STRING_SERVER_CONFIG"

Security checks

Static analysis is performed with Bandit:

bandit -r string_server utils tests -x .venv,venv -ll -f txt -o bandit_report.txt

πŸ—‚οΈ Confirm the resolved data file path

python - << 'EOF'
import os
from string_server.config import Config

cfg = Config(os.environ["STRING_SERVER_CONFIG"])
print("Resolved file path:", cfg.file_path)
EOF

🔌 Port already in use

ss -lptn 'sport = :9000' || true

Stop or reconfigure any conflicting processes.
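
Where ss is not available, a quick check from Python can tell whether something is already listening on the port. The function name port_in_use is hypothetical, and the check only confirms that a connection was accepted, not which process owns the port.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1",
                timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to host:port is accepted."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(timeout)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0
```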

🧊 Windows / WSL notes

For consistent benchmark and test results, run both the server and client in the same environment (either inside WSL or inside a Linux container). Avoid mixing Windows paths and Linux paths in the same run.


🛟 Running as a Linux Service (Daemon)

The server can be run as a systemd service to keep it running in the background and restart automatically on failure.

1) Install the project (example)

Install your project and configuration under /opt and /etc:

sudo mkdir -p /opt/string_server
sudo mkdir -p /etc/string_server

# Copy code and config (adjust paths as needed)
sudo cp -r . /opt/string_server
sudo cp config.ini /etc/string_server/config.ini

2) Create a systemd unit file

Create /etc/systemd/system/string_server.service:

[Unit]
Description=String Server
After=network.target

[Service]
Type=simple
WorkingDirectory=/opt/string_server
Environment=STRING_SERVER_CONFIG=/etc/string_server/config.ini
ExecStart=/usr/bin/env python -m string_server.server
Restart=on-failure
RestartSec=2
User=stringserver
Group=stringserver

[Install]
WantedBy=multi-user.target

The unit runs as the stringserver user and group, which must exist before the service starts. Create a dedicated account or adjust User/Group as needed.

3) Enable and start the service

sudo systemctl daemon-reload
sudo systemctl enable string_server.service
sudo systemctl start string_server.service
sudo systemctl status string_server.service

Once running, you can connect with nc or client.py exactly as in foreground mode.


πŸ—ΊοΈ Project Layout

string_server/
  __init__.py
  config.py          # Config loader, path resolution, validation
  server.py          # TCP server, threading, TLS, REREAD_ON_QUERY logic
  client.py          # Test/benchmark client used in speed testing
  search_engine.py   # Pluggable search algorithms

utils/
  logger.py          # Centralized logging configuration
  cert.pem           # (Optional) TLS certificate for dev/test
  key.pem            # (Optional) TLS private key for dev/test

tests/
  test_config.py
  test_config_branches.py
  test_server.py
  test_server_more_branches.py
  test_search_engine_algos.py
  data/
    sample_10k.txt
    # Other test data files

benchmark_speed.py   # Benchmark harness for speed report
config.ini           # Example/default config
README.md            # This document

🏁 Summary

  • ✅ Exact, full-line, case-sensitive string lookup server
  • ✅ Config-driven behavior (STRING_SERVER_CONFIG) with linuxpath= support
  • ✅ Optional TLS, configurable via SSL_ENABLED
  • ✅ Concurrency via threading, safe handling of many clients
  • ✅ REREAD_ON_QUERY semantics implemented and benchmarked
  • ✅ Strict linting, typing, and security checks
  • ✅ Test suite designed to run cleanly on a fresh Linux machine

👤 Author

Rahim
Backend Engineer · DevOps / Systems Engineering
Focused on speed, clarity, and engineering discipline.
