Our Customer: A fast-growing international technology company focused on building scalable digital platforms that improve everyday experiences for millions of users worldwide. The team operates in a high-load, cloud-native environment, emphasizing reliability, performance, and continuous innovation.
Your tasks:
* Design and develop platform-level solutions using Go and Python
* Drive and evolve observability practices across systems (logs, metrics, tracing)
* Investigate and resolve issues in distributed production environments
* Build and maintain monitoring, alerting, and telemetry infrastructure
* Implement proactive performance and saturation metrics using PromQL, SQL, and Lucene
* Work with Infrastructure as Code tools such as Terraform, Terragrunt, and CloudFormation
* Automate CI/CD and operational workflows using GitLab, Jenkins, Bash, Python, and AWS services
* Lead and coordinate cross-team technical initiatives
* Create scalable solutions without fully predefined requirements, taking ownership of design decisions
Required experience and skills:
* 7+ years of experience in DevOps, Platform Engineering, or Cloud Infrastructure roles
* Strong programming background in at least two languages (Go, Python, Java, C, or JavaScript)
* Solid understanding of observability concepts and modern monitoring approaches
* Hands-on experience with AWS, including networking and architecture
* Deep understanding of distributed systems and production troubleshooting
* Experience with Kubernetes and container orchestration concepts
* Familiarity with OpenTelemetry and tools like Grafana
* Experience working with Prometheus, Elasticsearch, and VictoriaMetrics
* Knowledge of the software development lifecycle and Test-Driven Development practices
* Strong analytical thinking and a data-driven decision-making mindset
* Ability to work independently and manage priorities effectively
* Upper-intermediate or higher English (C1)
Would be a plus:
* Experience with GitOps practices and tools such as Argo CD
* Background in building internal platforms or developer tooling
* Experience in high-scale or performance-critical environments
* Strong system design and architectural skills
* Proven ability to simplify complex problems and deliver practical solutions
Working conditions:
* 5-day working week, 8-hour working day
* Remote work