Site Reliability Engineer - 65K - London
Working with a leading digital sports media company, Handle is recruiting a Site Reliability Engineer.
Your new role
As a Site Reliability Engineer you will ensure millions of sports fans worldwide can engage with company products and applications. You will support our engineering teams to deliver highly available and performant solutions.
Reporting to the Site Reliability Lead, you will be part of our Site Reliability and DevOps team and have the opportunity to utilise a wide range of technologies and tools.
You will work with a wide range of delivery squads and engineering teams around the business to provide them with the support, tooling and knowledge to achieve great results. Ultimately, you will be passionate about the quality of software developed. Your aim will be to ensure the systems the business develops are highly available, low latency, robust to unexpected failures, scalable to high levels of load, cost effective and secure.
Your key responsibilities
- Working with engineering teams to help plan and deliver solutions, ensuring they are highly performant, reliable and secure.
- Developing tooling and libraries, and investigating new approaches and technologies, to help our development teams improve observability, performance, reliability and security.
- Proactively monitoring performance and reliability of systems and helping the Site Reliability team to define acceptable standards for key metrics. Identifying necessary improvements and working with developers to deliver them.
- Gathering data and presenting it to the wider business to help share understanding of the reliability and performance of our systems.
- Documenting our tooling and best practices for both technical and non-technical audiences.
- Supporting the technical response to outages and incidents, and designing and implementing improvements to our systems to prevent recurrence.
- Working closely with the DevOps team to ensure the infrastructure to run our systems is in place, that it is secure, and that we have the means to ship code in a safe, reliable and continuous manner.
Skills and experience
- Microservices and Cloud-based architecture
- Git usage and code management
- Documenting solutions and code
- Security Principles
- Monitoring and Logging tools and frameworks (e.g. Prometheus and Grafana for metrics, Fluentd, Elasticsearch and Kibana for logs)
- Databases (most of our services access MySQL databases via JPA and Hibernate)
- AWS (e.g. CloudFront, Route 53, RDS, DynamoDB, Lambda, API Gateway, EKS, EC2, IAM)
- Containerised systems (Docker, Kubernetes)
Benefits & Wellness
- Flexible & Remote Working
- Buy and Sell Annual Leave
- Pension Scheme
- Sporting Events and Tickets
- Health & Wellness Activities
- Complimentary Headspace Access
- Annual Wellness Contribution
- Mobile Phone Contribution
Handle actively welcomes applicants from under-represented backgrounds - we pride ourselves on attracting the best talent for every opportunity through a commitment to equality, diversity and inclusion.