A high-performance concurrent TCP string lookup server that performs exact full-line, case-sensitive matches against a configured text file.
- Assignment-grade: meets the specification (protocol, config, REREAD_ON_QUERY, TLS toggle, tests, benchmarks)
- Production-grade: strict linting/typing/security checks, configurable, daemon-ready, and testable in a clean Linux environment
- Exact full-line match
  - Case-sensitive
  - No partial matches (substring hits are rejected)
- Config-driven data path
  - Supports the spec-required `linuxpath=...` line
  - Also supports structured keys like `FILE_PATH`, `DATA_PATH`, `LINUX_PATH`
- REREAD_ON_QUERY mode
  - `False`: load file once at startup for minimum latency
  - `True`: re-open file on every query to reflect changes
- Concurrent handling
  - Multi-threaded server to handle many clients in parallel
- Optional TLS
  - Certificate-based TLS via Python `ssl`
  - Enabled/disabled via configuration (`SSL_ENABLED`)
- Deterministic debug logging
  - One `DEBUG:` line per request with timestamp, IP, query and execution time
- Benchmark tooling
  - Dedicated benchmark client and harness
  - Used to generate a speed report (multiple algorithms, both modes)
- Client connects via TCP.
- Server reads up to 1024 bytes from the connection.
- Processing rules:
  - Strip any trailing `\x00` bytes.
  - Use only the first line of the payload as the query string.
  - Perform an exact full-line, case-sensitive match against the configured data file. Partial line matches are not counted as hits.
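The processing rules above can be sketched in a few lines of Python. This is an illustrative sketch, not the server's actual code: the function name `extract_query`, the UTF-8 decoding, and the removal of a trailing carriage return are assumptions.

```python
def extract_query(payload: bytes) -> str:
    """Turn a raw request payload into the query string (hypothetical helper)."""
    # Consider at most 1024 bytes, mirroring the server's read limit.
    data = payload[:1024]
    # Strip trailing NUL padding.
    data = data.rstrip(b"\x00")
    # Use only the first line; anything after the first newline is ignored.
    first_line = data.split(b"\n", 1)[0]
    # Assumption: drop a trailing carriage return so "test\r\n" behaves like "test\n".
    first_line = first_line.rstrip(b"\r")
    # Assumption: queries are UTF-8 text; undecodable bytes are replaced.
    return first_line.decode("utf-8", errors="replace")
```

With this sketch, `extract_query(b"test\n\x00\x00")` and `extract_query(b"test\nignored\n")` both yield `"test"`.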
For each query, the server writes:
- A single `DEBUG:` line containing:
  - Timestamp in UTC
  - Client IP address
  - Query string
  - Measured execution time in milliseconds
- A result line: `STRING EXISTS` or `STRING NOT FOUND`
- Each line is terminated with a newline (`\n`).
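As a rough illustration of the response format, the following sketch assembles the two lines for one query. The field layout is copied from the example output in this README; the function name `build_response` and its parameters are hypothetical.

```python
from datetime import datetime, timezone
from typing import Optional


def build_response(ip: str, query: str, exec_ms: float, found: bool,
                   now: Optional[datetime] = None) -> bytes:
    """Build the DEBUG line plus the result line (hypothetical helper)."""
    # Use the supplied time (useful in tests) or the current UTC time.
    utc = (now or datetime.now(timezone.utc)).astimezone(timezone.utc)
    # Millisecond-precision UTC timestamp with a trailing "Z".
    ts = utc.strftime("%Y-%m-%dT%H:%M:%S") + f".{utc.microsecond // 1000:03d}Z"
    debug = f'DEBUG: ts={ts} ip={ip} query="{query}" exec_ms={exec_ms:.3f}'
    result = "STRING EXISTS" if found else "STRING NOT FOUND"
    # Each line is terminated with a newline.
    return f"{debug}\n{result}\n".encode("utf-8")
```

Passing `now` explicitly keeps the output deterministic, which makes the format easy to assert on in tests.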
Example:
printf "test\n" | nc 127.0.0.1 9000
Output:
DEBUG: ts=2025-12-09T10:55:08.093Z ip=127.0.0.1 query="test" exec_ms=0.020
STRING EXISTS
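The same exchange can be driven from Python instead of `nc`. This is a minimal sketch that assumes the plain-TCP (non-TLS) mode and a two-line response; `query_server` is a hypothetical name and is not the project's `client.py`.

```python
import socket
from typing import List


def query_server(host: str, port: int, query: str, timeout: float = 5.0) -> List[str]:
    """Send one query and return the response lines (hypothetical helper)."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        # The query is a single newline-terminated line.
        sock.sendall(query.encode("utf-8") + b"\n")
        data = b""
        # Read until both response lines have arrived or the peer closes.
        while data.count(b"\n") < 2:
            chunk = sock.recv(4096)
            if not chunk:
                break
            data += chunk
    return data.decode("utf-8").splitlines()
```

With a server listening locally, `query_server("127.0.0.1", 9000, "test")` would return the `DEBUG:` line followed by the result line.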
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
pip install -r requirements-dev.txt
The server expects an environment variable that points to the config file:
export STRING_SERVER_CONFIG=./config.ini
You can verify:
echo "$STRING_SERVER_CONFIG"
Start the server:
python -m string_server.server
In a second terminal, send a query:
printf "test\n" | nc 127.0.0.1 9000
You should see a `DEBUG:` line followed by either `STRING EXISTS` or `STRING NOT FOUND`.
The server reads configuration from an INI file via Config. It is designed to
handle extra keys that are irrelevant to the server without failing.
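To make the tolerant-parsing idea concrete, here is a small sketch built on the standard `configparser` module. It is not the project's `Config` class: the function name `load_settings`, the `PATH_KEYS` tuple, and the order in which path keys are tried are all assumptions made for illustration.

```python
import configparser
from typing import Any, Dict, Optional

# Candidate path keys; the lookup order is an assumption, not the spec.
PATH_KEYS = ("linuxpath", "FILE_PATH", "DATA_PATH", "LINUX_PATH", "WINDOWS_PATH")


def load_settings(ini_text: str) -> Dict[str, Any]:
    """Read only the keys the server needs; unrelated keys are ignored."""
    parser = configparser.ConfigParser()
    parser.read_string(ini_text)
    section = parser["DEFAULT"]

    file_path: Optional[str] = None
    for key in PATH_KEYS:
        # configparser matches option names case-insensitively by default.
        if key in section:
            file_path = section[key]
            break
    if file_path is None:
        raise ValueError("no data file path configured")

    return {
        "host": section.get("HOST", fallback="127.0.0.1"),
        "port": section.getint("PORT", fallback=9000),
        "file_path": file_path,
        "reread_on_query": section.getboolean("REREAD_ON_QUERY", fallback=False),
        "ssl_enabled": section.getboolean("SSL_ENABLED", fallback=False),
    }
```

Because only named keys are read, any additional entries in the file pass through without raising an error.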
[DEFAULT]
HOST=127.0.0.1
PORT=9000
# Required by specification:
# This line is used to locate the data file and may coexist with other keys.
linuxpath=tests/data/sample_10k.txt
# Internal file path keys (server supports any of these):
# FILE_PATH, DATA_PATH, LINUX_PATH, WINDOWS_PATH
LINUX_PATH=tests/data/sample_10k.txt
# Performance mode:
# false: preload once (fastest, file considered stable)
# true: re-open and re-parse per query (file may change rapidly)
REREAD_ON_QUERY=false
# Optional TLS
SSL_ENABLED=false
CERT_PATH=./utils/cert.pem
KEY_PATH=./utils/key.pem
[BENCHMARK]
# Default benchmark query and parameters:
QUERY=test
ITERATIONS=5000
WARMUP=500
- One entry per line in the data file.
- Matching is:
- exact (no substring matches),
- full-line (the query must match a complete line),
- case-sensitive.
- For portability across tools and platforms, ensure there is a newline at the end of the file.
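The matching rules above can be illustrated with a hash-set lookup. A set is only one possible strategy (the server picks its algorithm from the benchmark results), and `build_index` is a hypothetical helper name.

```python
from typing import FrozenSet


def build_index(text: str) -> FrozenSet[str]:
    """Split file contents into lines and index them for exact lookups."""
    lines = text.split("\n")
    # A final newline terminates the last line; it does not start an empty one.
    if lines and lines[-1] == "":
        lines.pop()
    # Assumption: tolerate Windows line endings by dropping a trailing "\r".
    return frozenset(line.rstrip("\r") for line in lines)


index = build_index("apple\nBanana\ncherry pie\n")

print("apple" in index)       # exact full-line match
print("app" in index)         # substring of a line, not a hit
print("banana" in index)      # differs in case, not a hit
print("cherry pie" in index)  # the whole line, including the space
```

Set membership compares whole strings, so substring and case-insensitive hits are excluded without any extra checks.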
The server supports certificate-based TLS, controlled entirely via config.
Generate a self-signed certificate for local testing:
openssl req -x509 -newkey rsa:2048 \
-keyout utils/key.pem -out utils/cert.pem \
-nodes -days 365
Enable TLS in the config:
SSL_ENABLED=true
CERT_PATH=./utils/cert.pem
KEY_PATH=./utils/key.pem
Query the server over TLS:
printf "test\n" | openssl s_client -connect 127.0.0.1:9000 -quiet
You should see the same `DEBUG:` + result response, but now over TLS.
Note: PSK TLS is not implemented. The server uses certificate-based authentication as allowed by the specification.
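On the server side, a config-driven TLS toggle could look roughly like the sketch below, which uses Python's standard `ssl` module. The function name `build_server_context` and the TLS 1.2 minimum are assumptions; the project's actual wiring may differ.

```python
import ssl
from typing import Optional


def build_server_context(ssl_enabled: bool, cert_path: str,
                         key_path: str) -> Optional[ssl.SSLContext]:
    """Return a server-side TLS context, or None when TLS is disabled."""
    if not ssl_enabled:
        return None
    context = ssl.SSLContext(ssl.PROTOCOL_TLS_SERVER)
    # Assumption: refuse protocol versions older than TLS 1.2.
    context.minimum_version = ssl.TLSVersion.TLSv1_2
    # Load the certificate and private key named by CERT_PATH / KEY_PATH.
    context.load_cert_chain(certfile=cert_path, keyfile=key_path)
    return context


# Usage sketch: wrap each accepted connection only when a context exists.
#   conn, addr = listener.accept()
#   if context is not None:
#       conn = context.wrap_socket(conn, server_side=True)
```

Returning `None` when TLS is off lets the accept loop share one code path for both modes.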
The REREAD_ON_QUERY option controls how aggressively the server tracks changes
to the data file:
`REREAD_ON_QUERY=false`
- Data file is read once at server startup.
- Lookups are performed against an in-memory structure.
- Target average latency: ~0.5 ms per query for large files (10k–250k lines), depending on hardware.
`REREAD_ON_QUERY=true`
- The data file is re-opened and re-read on every query.
- The server always sees the latest contents of the file.
- Target average latency: ≤ 40 ms per query for large files, depending on hardware.
The benchmark tooling (see below) is used to verify that the chosen search algorithm(s) meet these thresholds.
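The difference between the two modes can be sketched with a small lookup class. This is an illustration under stated assumptions, not the code in `search_engine.py`: the class name `LineLookup`, the UTF-8 encoding, and the set-based lookup are all hypothetical choices.

```python
from pathlib import Path
from typing import FrozenSet, Optional


class LineLookup:
    """Exact full-line lookup with an optional per-query reload."""

    def __init__(self, path: str, reread_on_query: bool) -> None:
        self._path = Path(path)
        self._reread = reread_on_query
        # With rereading disabled, load once at startup and keep it in memory.
        self._cache: Optional[FrozenSet[str]] = None if reread_on_query else self._load()

    def _load(self) -> FrozenSet[str]:
        # Assumption: the data file is UTF-8 text with one entry per line.
        with self._path.open("r", encoding="utf-8") as handle:
            return frozenset(line.rstrip("\n") for line in handle)

    def exists(self, query: str) -> bool:
        # Re-open the file on every query only when the mode requires it.
        lines = self._load() if self._reread else self._cache
        return lines is not None and query in lines
```

A lookup created with `reread_on_query=False` keeps answering from its startup snapshot, while one created with `reread_on_query=True` observes lines appended to the file later, at the cost of a full read per query.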
The repository includes a benchmark harness and client designed to:
- Compare at least 5 different search strategies.
- Measure performance for:
  - `REREAD_ON_QUERY=true`
  - `REREAD_ON_QUERY=false`
- Generate measurements for different file sizes (from 10,000 to 1,000,000 lines).
- Provide data for:
- Average latency
- p50 / p95 / p99 percentiles
- Throughput under increased concurrency
python benchmark_speed.py
This script:
- Loads configuration from `config.ini` (including the `[BENCHMARK]` section).
- Invokes `client.py`-style logic to send many queries.
- Prints summary statistics for each algorithm and mode.
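For readers who want to reproduce the summary statistics, the sketch below times a callable and reports the average and nearest-rank percentiles. It is not the contents of `benchmark_speed.py`; the names `measure`, `percentile`, and `summarize` are hypothetical, and the `iterations`/`warmup` parameters simply mirror the `ITERATIONS` and `WARMUP` keys in `[BENCHMARK]`.

```python
import statistics
import time
from typing import Callable, Dict, List


def measure(fn: Callable[[], object], iterations: int, warmup: int) -> List[float]:
    """Return per-call latencies in milliseconds, after discarded warm-up calls."""
    for _ in range(warmup):
        fn()
    samples = []
    for _ in range(iterations):
        start = time.perf_counter()
        fn()
        samples.append((time.perf_counter() - start) * 1000.0)
    return samples


def percentile(ordered: List[float], pct: int) -> float:
    """Nearest-rank percentile of an already sorted, non-empty list."""
    # Integer ceiling division avoids floating-point rounding in the rank.
    rank = max(1, (pct * len(ordered) + 99) // 100)
    return ordered[rank - 1]


def summarize(latencies_ms: List[float]) -> Dict[str, float]:
    """Average plus p50 / p95 / p99 for a list of latency samples."""
    if not latencies_ms:
        raise ValueError("no samples to summarize")
    ordered = sorted(latencies_ms)
    return {
        "avg": statistics.fmean(ordered),
        "p50": percentile(ordered, 50),
        "p95": percentile(ordered, 95),
        "p99": percentile(ordered, 99),
    }
```

Nearest-rank is one of several percentile definitions; interpolating methods give slightly different values on small samples.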
Benchmark results are summarized in a separate PDF:
- Path (example): `docs/speed_report.pdf`
- Contains:
  - A table comparing at least 5 algorithms (sorted by performance).
  - At least one chart showing performance vs. file size.
  - Results for both `REREAD_ON_QUERY=true` and `false`.
  - Justification for the algorithm chosen as the default implementation in the server.
The server code in search_engine.py uses the fastest algorithm determined
by these benchmarks for each mode.
The server is designed to be testable in a clean environment with no hardcoded
paths. All test data lives under tests/data/.
coverage erase
coverage run --source=string_server -m pytest -q
coverage report -m --fail-under=100
To confirm that the test suite does not depend on your shell environment:
env -i PATH=/usr/bin:/bin PYTHONHASHSEED=0 LC_ALL=C TZ=UTC \
./.venv/bin/python -m pytest -q
Run the formatter, linters, type checker, and security scan:
ruff format .
ruff check .
mypy .
bandit -r .
flake8 .
- `flake8` is configured with `max-line-length = 79` and the usual modern ignores (`E203`, `W503`).
- `mypy` enforces static typing.
- `bandit` provides a baseline security scan.
- `ruff` handles linting + formatting for Python.
If you see an error about missing configuration:
export STRING_SERVER_CONFIG=./config.ini
Verify:
echo "$STRING_SERVER_CONFIG"
Static analysis is performed with Bandit:
bandit -r string_server utils tests -x .venv,venv -ll -f txt -o bandit_report.txt
To check which data file the configuration resolves to:
python - << 'EOF'
import os
from string_server.config import Config
cfg = Config(os.environ["STRING_SERVER_CONFIG"])
print("Resolved file path:", cfg.file_path)
EOF
To check whether another process is already listening on port 9000:
ss -lptn 'sport = :9000' || true
Stop or reconfigure any conflicting processes.
For consistent benchmark and test results, run both the server and client in the same environment (either inside WSL or inside a Linux container). Avoid mixing Windows paths and Linux paths in the same run.
The server can be run as a systemd service to keep it running in the background and restart automatically on failure.
Install your project and configuration under /opt and /etc:
sudo mkdir -p /opt/string_server
sudo mkdir -p /etc/string_server
# Copy code and config (adjust paths as needed)
sudo cp -r . /opt/string_server
sudo cp config.ini /etc/string_server/config.ini
Create /etc/systemd/system/string_server.service:
[Unit]
Description=String Server
After=network.target
[Service]
Type=simple
WorkingDirectory=/opt/string_server
Environment=STRING_SERVER_CONFIG=/etc/string_server/config.ini
ExecStart=/usr/bin/env python -m string_server.server
Restart=on-failure
RestartSec=2
User=stringserver
Group=stringserver
[Install]
WantedBy=multi-user.target
Adjust User/Group as needed (or create a dedicated user).
sudo systemctl daemon-reload
sudo systemctl enable string_server.service
sudo systemctl start string_server.service
sudo systemctl status string_server.service
Once running, you can connect with nc or client.py exactly as in foreground mode.
string_server/
    __init__.py
    config.py                    # Config loader, path resolution, validation
    server.py                    # TCP server, threading, TLS, REREAD_ON_QUERY logic
    client.py                    # Test/benchmark client used in speed testing
    search_engine.py             # Pluggable search algorithms
utils/
    logger.py                    # Centralized logging configuration
    cert.pem                     # (Optional) TLS certificate for dev/test
    key.pem                      # (Optional) TLS private key for dev/test
tests/
    test_config.py
    test_config_branches.py
    test_server.py
    test_server_more_branches.py
    test_search_engine_algos.py
    data/
        sample_10k.txt
        # Other test data files
benchmark_speed.py               # Benchmark harness for speed report
config.ini                       # Example/default config
README.md                        # This document
- Exact, full-line, case-sensitive string lookup server
- Config-driven behavior (`STRING_SERVER_CONFIG`) with `linuxpath=` support
- Optional TLS, configurable via `SSL_ENABLED`
- Concurrency via threading, safe handling of many clients
- REREAD_ON_QUERY semantics implemented and benchmarked
- Strict linting, typing, and security checks
- Test suite designed to run cleanly on a fresh Linux machine
Rahim
Backend Engineer · DevOps / Systems Engineering
Focused on speed, clarity, and engineering discipline.