Job description
Reference Number: SRE1022

The Role:
As our new Site Reliability Engineer, you will join a team that works with cutting-edge technologies every day and strives to make the most of cloud services. You will monitor critical applications and services to minimize downtime and ensure their availability, and you will make sure that the underlying infrastructure runs smoothly and that systems and tools work as expected. You will work closely with developers to help with troubleshooting and to provide consultation when alerts are issued.

The main responsibilities of the position include:
Monitor critical application metrics and create alerts
Build/Use software to help DevOps, Devs & Support teams
Fix support escalation issues
Maintain documentation and runbooks
Conduct post-incident reviews

Main requirements:
2+ years of experience in a similar role
Experience in incident, problem and change management practices
Extensive working experience with CI/CD procedures and tools (e.g. Gitlab CI)
Strong experience using containers and Kubernetes
Experience with Infrastructure as Code (Terraform, CloudFormation)
Ability to work as part of a distributed team
Experience with monitoring tools (Prometheus, Grafana, New Relic)

The following will be considered an advantage:
Familiarity with database concepts
Working experience with at least one cloud provider (preferably AWS)
Apache Kafka configuration and troubleshooting
ELK configuration and troubleshooting
Scripting skills (Bash, PowerShell, Python, Go, etc.)

Benefit from:
Attractive remuneration package
Private health insurance
Corporate pension fund
Food allowance
Intellectually stimulating work environment
Continuous personal development and international training opportunities

Type of employment: Full time
Location: Cyprus or Greece
All applications will be treated with strict confidentiality!