On the way Site Reliability Engineer with Server Reboot Engineer experience as well as System Reboot Engineer and Service Restart Engineer experience.
Currently pursuing a bachelor's degree in Computer Science. Open source software excites me, and I am ever ready to learn more.
Skills: Linux, Bash, Python, Ansible, Prometheus, PLG stack, Docker, Kubernetes, GitLab CI, GitHub Actions, Jenkins ...
- 🔭 I'm currently working for an e-commerce company.
- 🌱 I'm currently learning about Linux, open source, Kubernetes, Helm, security, and Golang.
- 😄 Hobbies: Football :soccer:, Swimming :swimmer:, Jogging :runner:
- 🔨 Tools: Vim, Xcode, Visual Studio Code, Atom, Postman, Jira, SonarQube, Neovim.
- 🤿 DevOps: Bash, Docker, Kubernetes, CI/CD, Jenkins, Grafana, Loki, Prometheus, Thanos, VictoriaMetrics, Terraform, Ansible, Vault, Nginx, Consul, OpenTelemetry.
- 💾 Database: PostgreSQL, MySQL, Redis, MongoDB, MariaDB, MSSQL.
- 🖥️ Operating system: Windows, macOS, Linux, Ubuntu, Fedora, Arch Linux, Linux Mint.
Address: [Address], [City, State, ZIP]
Phone: [Phone Number]
Email: [Email Address]
GitHub: [GitHub Address]
I'm a DevOps Engineer and Site Reliability Engineer (SRE) with over 3 years of hands-on experience designing, building, maintaining, and administering large-scale Kubernetes clusters and highly scalable, reliable systems, with a focus on performance optimization and cost reduction. I also manage data centers across Linux-based environments, networking, and services. I specialize in supporting, automating, and optimizing mission-critical deployments on cloud platforms such as AWS and Azure, using configuration management tools, CI/CD pipelines, and DevOps best practices. Outside of work, I dedicate my free time to learning new technologies and contributing to the open-source community, driven by a deep passion for Linux and innovation. I am seeking a challenging position where I can apply my expertise in automation, monitoring, incident response, and infrastructure management to ensure the availability, performance, and efficiency of critical applications and services.
- [Bachelor's/Master's Degree] in [Computer Science/Engineering/Information Technology]
 [University Name], [Year]
- [0xAWS], [Certifying Organization], [2024]
- [0xAzure], [Certifying Organization], [2024]
- [0xGCP], [Certifying Organization], [2024]
- [0xCKA], [Certifying Organization], [2024]
- [0xCKS], [Certifying Organization], [2024]
- Programming Languages: Shell scripting, Python, Go
- Cloud Technologies: AWS, Azure, GCP, VMware, OVHCloud
- Containerization and Orchestration: Docker, Kubernetes, AKS, RKE, EKS
- Infrastructure as Code (IaC): Terraform, Atlantis, Packer
- Configuration Management: Ansible, Chef
- Networking/Security: TCP/IP, DNS, Load Balancing, Switching and Routing, IPSec/SSL VPN, Zero Iptables, Fail2ban, Firewalld, CSF, pfSense, CyberArk, Vault
- SAST: Snyk, SonarQube, GHAS
- Telecom: Asterisk, FreePBX, CIS AutoDial, DAHDI tool (E1), GSM gateway
- Continuous Integration/Continuous Delivery (CI/CD): Jenkins, GitLab CI, GitHub Actions, ArgoCD, FluxCD
- Distributed tracing: Tempo, Jaeger, OpenTelemetry, Sentry, APM Elasticsearch
- Monitoring and Alerting: Prometheus, Grafana, VictoriaMetrics, ELK Stack, LGTM Stack, PLG Stack
- Incident Response and Troubleshooting: PagerDuty, Splunk
- Reliability Engineering: SLA/SLO, Error Budgets
- Collaboration and Communication: Jira, Confluence
- Collaborated with SWE, QC, and PO teams to automate and accelerate building, testing, releasing, and deploying applications to runtime environments quickly and reliably, and improved application performance and reliability through performance tuning, load testing, and code optimization.
- Integrated security best practices into CI/CD pipelines by establishing a DevSecOps culture. Automated security scanning and compliance validation with tools like Snyk and Soblew, achieving a significant reduction in vulnerabilities and meeting all regulatory audit requirements.
- Designed and implemented a GitOps-based continuous deployment pipeline using GitHub Actions (GHA) and Jenkins for CI and ArgoCD for CD, accelerating release cycles and improving code quality through a significant reduction in production bugs, while reducing configuration drift and decreasing rollback incidents across 8+ Kubernetes clusters for greater stability and reliability in production environments.
- Collaborated with DE and DA teams to set up DataOps, accelerating continuous integration and continuous deployment of data engineering solutions built on Airflow, Spark, PostgreSQL, ClickHouse, and Power BI. Worked with Data Engineers and Data Analysts to set up automated monitoring with Prometheus and Grafana that continuously tracks the reliability, availability, and performance of data pipeline components and processes. Detected incidents in real time so they could be resolved in near real time, reduced deployment time, and increased release frequency.
- Designed and implemented a centralized observability platform with metrics, monitoring, and alerting using Prometheus, Grafana, Percona, and OpenTelemetry for real-time monitoring of application services. Reduced incident response time through proactive alerting, with the ELK Stack and PLG Stack for enhanced logging and analysis. Used Sentry and Elasticsearch APM for error and exception tracking, APM tracing, log aggregation, and incident diagnostics.
- Designed and implemented multi-node Kubernetes clusters with autoscaling and intelligent resource management using Karpenter and KEDA. Achieved optimized container resource allocation and enhanced system performance while ensuring high availability, fault tolerance, and seamless operation for mission-critical applications.
- Automated and migrated Jenkins pipelines and manifest template deployments for containerized applications to GitHub Actions (GHA), Helm, and Kustomize, enabling faster and more reliable release cycles.
- Reduced AWS cloud infrastructure costs by 35% through cost monitoring with AWS Cost Explorer and Azure Cost Management, tuning Kubernetes cluster autoscaling, and using spot instances across AWS and Azure, all while maintaining system stability and performance.
- Led incident response and troubleshooting efforts, ensuring timely resolution of critical incidents and minimizing downtime.
- Implemented infrastructure automation using Terraform and Ansible, reducing manual provisioning time by 70% and improving consistency across environments.
- Designed and built scalable Kubernetes clusters on AWS/GCP for deploying microservices, improving application scalability and fault tolerance.
- Developed and maintained CI/CD pipelines using Jenkins and GitLab CI/CD, enabling automated building, testing, and deployment of applications.
- Implemented monitoring and alerting solutions using Prometheus, Grafana, and ELK Stack, enabling proactive issue detection and reducing mean time to resolution.
- Collaborated with development teams to improve application performance and reliability through performance tuning, load testing, and code optimization.
- Conducted Chaos Engineering experiments to proactively identify system weaknesses and improve resilience.
- Participated in on-call rotations, responding to incidents and performing root cause analysis to prevent recurrence.
- Collaborated with R&D teams to develop and maintain CI/CD pipelines using GitLab CI and GitHub Actions (GHA). Integrated automated building, testing, code scanning, release, and deployment of applications to runtime environments, reducing build and deployment times. Orchestrated the migration of Dockerized services and web applications to Kubernetes (RKE).
- Designed and implemented a GitOps workflow for vCenter On-Premises infrastructure, leveraging GitLab CI/CD, Packer, Terraform and Ansible. Automated the provisioning, configuration, and management of virtual machines, reducing manual intervention and deployment times while ensuring consistency and scalability across the environment.
- Collaborated with Security and Incident Response teams to implement and maintain system hardening, mitigations, hotfixes, patches, updates, and security controls, ensuring compliance with industry standards.
- Implemented a robust monitoring and alerting system to ensure system availability and reliability using GitLab CI/CD, Prometheus, and Grafana, resulting in decreased system downtime and faster issue resolution. Developed applications to proactively identify and address issues. Implemented centralized logging and log analysis using PLG Stack and ELK Stack, enhancing troubleshooting and monitoring capabilities.
- Collaborated with cross-functional IT teams (Network, Support, Development, Security) to deploy, secure, and optimize IT systems across development, testing, and production environments. Designed, configured, and deployed tools such as VMware, GitLab, and Jira to enhance team productivity and improve system reliability across on-premises and GCP infrastructure (Compute Engine, Cloud Storage, Cloud SQL).
- Managed infrastructure on AWS, including EC2, S3, RDS, and VPC, ensuring high availability, scalability, and security.
- Automated infrastructure provisioning and configuration using Terraform and Ansible, reducing deployment time by 50% and improving infrastructure consistency.
- Implemented centralized logging and log analysis using ELK Stack, improving troubleshooting and monitoring capabilities.
- Worked closely with development teams to implement performance monitoring and optimization strategies.
- Collaborated with security teams to implement and maintain security controls and ensure compliance with industry standards.
- Conducted disaster recovery planning and testing exercises to ensure business continuity.
- Managed and maintained services and web applications using Docker containerization. Led the migration of services and web applications to Docker, ensuring seamless integration. In the next phase, orchestrated the migration of legacy monolithic applications to a containerized architecture on Kubernetes (kubeadm), enhancing scalability and operational efficiency.
- Implemented a GitOps workflow for infrastructure automation on VMware on-premises infrastructure using GitLab CI, Packer, Terraform, Ansible, and Rundeck, reducing manual provisioning time and improving consistency across environments.
- Collaborated with cross-functional teams to design and implement network, system, and storage infrastructures. Partnered with development teams to deploy and manage development, staging, and production environments using GitLab CI/CD. Successfully supported multiple frameworks, including LAMP, LEMP, PHPFox, and WordPress, ensuring seamless integration and optimal performance.
- Built, managed, operated, and monitored network systems and service infrastructure, and researched, evaluated, and selected solutions for them (Prometheus, Grafana, Loki, Zabbix). Implemented centralized logging, log analysis, and network performance and traffic analysis with the ELK Stack, sending alerts to Telegram and Jira, which improved troubleshooting and monitoring capabilities, decreased system downtime, and sped up issue resolution.
- Managed and maintained IT software infrastructure, including upgrades, mitigations, hotfixes, patches, and security controls for both On-Premises environments and AWS infrastructure (EC2, S3, RDS). Designed and implemented robust backup and disaster recovery policies to ensure the durability and availability of company systems.
- Installed, configured, and maintained development, staging, and production environments for 400+ virtual machines on VMware vCenter within a data center environment. Managed and maintained a network infrastructure of over 100 Cisco and Nexus devices, ensuring optimal performance, security, and reliability. Responsible for configuring systems and for installing, deploying, and administering services on Linux-based operating systems and network devices, ensuring they meet organizational requirements and directives from superiors.
- Designed and maintained a robust monitoring and alerting system using Zabbix and PRTG, resulting in decreased network and system downtime and faster issue resolution. Monitored the company's IT infrastructure: servers, backup systems, switch and router configurations, cameras, firewalls, internet connections, applications, and software systems.
- Collaborated with SWE teams to develop and maintain automated product releases to the production environment using Bash and Ansible. Developed and maintained the end-to-end CI/CD pipeline using Bash and Ansible, reducing installation, build, and deployment time.
- Experienced in installing, configuring, and supporting VoIP systems, including call routing and voicemail management, and in using troubleshooting tools and techniques to diagnose and resolve network and voice quality issues. Engineered and deployed scalable VoIP solutions for over 3,000 users within the organization. Implemented a VoIP system for a new call center, supporting an additional 7,000 concurrent calls. Migrated traditional telephony systems to a centralized VoIP platform. Designed and maintained a robust call monitoring and alerting system using Prometheus and Grafana.
- Database Administrator (MariaDB, MySQL): Managed and optimized standalone and clustered environments, implemented replication, automated backups, tuned performance, ensured high availability, and supported leadership with data queries and visualization reports.
- Project Name: Implemented a comprehensive observability solution using Prometheus and Grafana, providing real-time monitoring and alerting for critical applications and services.
- Project Name: Led the migration of legacy infrastructure to a containerized architecture using Kubernetes, resulting in improved scalability and reduced operational overhead.
Available upon request