DevOps / Infrastructure Engineer · R&D · AI & Experimental Systems
I design, deploy, and keep alive complex infrastructure for R&D, AI-driven, and experimental projects. My focus is reliability under uncertainty: self-healing systems, reproducible environments, and automation that reduces cognitive load.
I work at the intersection of DevOps, research infrastructure, and emerging technologies, where systems are unstable by default and must be made predictable.
Solid background in enterprise infrastructure and systems administration: Linux servers, distributed environments, backup and recovery design, monitoring, secure remote access (VPN/IPsec), and network troubleshooting. Hands-on experience building and operating production systems, supporting geographically distributed sites, and maintaining reliability under real-world constraints. A strong foundation in networking, security, and operational discipline that has since evolved into modern DevOps and infrastructure engineering practice.
- Build and maintain Linux-based infrastructure (workstations, servers, VDS/VPS)
- Design self-healing and fault-tolerant setups (systemd, watchdogs, auto-recovery)
- Diagnose hard-to-track issues: networking, services, hardware, race conditions
- Automate routine operations to keep systems calm and observable
- Support R&D and AI/LLM workflows (data pipelines, RAG backends, agent systems)
- Turn “it works sometimes” into “it works reliably”
- Linux: Debian, Ubuntu, Manjaro, systemd, zsh
- Infrastructure: bare metal, VPS/VDS, remote access, reverse connections
- Networking: DNS, mDNS (Avahi), VPN, troubleshooting unstable networks
- Containers: Docker, docker-compose
- Databases: PostgreSQL (incl. as backbone for knowledge / RAG systems)
- Automation: shell scripting, recovery workflows, service orchestration
- Security: SSH keys, access control, secrets handling
- Diagnostics: logs, metrics, failure analysis under real load
- Migration and support of heterogeneous environments (laptops → workstations → remote servers)
- “Unkillable” services: reverse access, auto-restart, escalation logic
- Infrastructure for AI agents, RAG systems, and experimental software
- Integration of local, cloud, and hybrid setups
- Working in environments where documentation is incomplete and failure is expected
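
To make "auto-restart, escalation logic" concrete, here is a minimal sketch of the idea in Python. It is illustrative only and not taken from any specific project: the `supervise` function, its parameters, and the doubling backoff schedule are all assumptions made for this example. In practice the same behavior is often delegated to systemd restart policies.

```python
import subprocess
import time
from typing import Callable, Sequence


def supervise(
    cmd: Sequence[str],
    max_restarts: int = 3,
    base_delay: float = 1.0,
    escalate: Callable[[str], None] = print,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Run cmd, restarting it on failure with exponential backoff.

    Returns True as soon as the command exits with status 0.
    After max_restarts failed restarts, calls escalate() once
    and returns False. All names here are hypothetical.
    """
    attempt = 0
    while True:
        exit_code = subprocess.run(cmd).returncode
        if exit_code == 0:
            return True  # clean exit, nothing to recover
        if attempt >= max_restarts:
            # Restart budget exhausted: hand the problem to a human or
            # a higher-level system instead of looping forever.
            escalate(
                f"{cmd[0]} still failing after {attempt} restarts "
                f"(exit {exit_code})"
            )
            return False
        # Back off before restarting: the delay doubles on each attempt.
        sleep(base_delay * 2 ** attempt)
        attempt += 1
```

The `escalate` and `sleep` callables are injectable so the escalation path (a pager, a chat message, a log line) can be swapped without touching the restart loop, and so the logic can be tested without real delays.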
- Infrastructure is a system, not a pile of services
- Stability > features
- Automate everything repeatable
- Reduce human stress before scaling machines
- Prefer simple, inspectable solutions over magic
- Early-stage startups
- Research and experimental labs
- AI / LLM / data-heavy projects
- Teams that need someone who can both think and execute
- GitHub: https://github.com/t1p
