Key Responsibilities:
- Design and implement scalable, reliable infrastructure for financial platforms
- Develop automation tools for deployment, monitoring, and incident response
- Define and maintain SLIs/SLOs and manage error budgets
- Lead root cause analysis and post-mortem processes for production incidents
- Collaborate with engineering teams to embed reliability into the software development lifecycle
- Ensure compliance with regulatory and security standards in system operations
Requirements:
- 3-8 years of experience in SRE, DevOps, or infrastructure engineering
- Strong proficiency in cloud platforms (AWS, Azure, or GCP)
- Experience with containers and container orchestration (Docker, Kubernetes)
- Familiarity with monitoring tools (Prometheus, Grafana, ELK, etc.)
- Solid scripting skills (Python, Bash, Go, etc.)
- Understanding of CI/CD pipelines and infrastructure as code (Terraform, Ansible)
- Knowledge of financial systems, regulatory requirements, or high-availability architectures is a plus
What you need to do now
If you're interested in this role, click 'apply now' to forward an up-to-date copy of your CV.