Site Reliability Engineer

Grade
SEO

We are looking for a Site Reliability Engineer to join our community of digital specialists, to help deliver more great services.

You’ll be part of a team that shares a vision of making public services digital by default, simpler, clearer and faster to use. As a Site Reliability Engineer, you will be using the latest technologies and trends, whilst delivering working software early and often. Working as part of a multi-disciplinary team, you’ll develop your skills to build a career as a Site Reliability Engineer. You will be helping to build digital services for a diverse set of users, including citizens, teachers, social workers and school professionals.

The technologies you will be using as a Site Reliability Engineer in DfE include: Docker, Linux, Git, GitHub Actions, Azure, Azure DevOps, Ruby, Ruby on Rails, PowerShell, Terraform, Prometheus, Grafana and the ELK stack.

You will:

  • Be part of a team that runs and supports Government digital services for teachers
  • Help automate tasks, deployments, and tests by creating infrastructure as code
  • Implement resilient, highly available systems
  • Implement modern software development practices, such as CI/CD and DevOps, as well as modern development workflows using GitHub and Azure DevOps
  • Work in a fully Agile environment
  • Use development skills to maintain applications as well as create powerful automation and monitoring scripts
  • Use infrastructure skills to deploy and integrate services in the cloud
  • With the support of senior SREs and the wider community, learn to build secure, reliable and scalable systems, automate processes to increase delivery efficiency, and help developers troubleshoot live systems
  • Share knowledge of tools and techniques with the wider team and community, both developers and non-developers
  • Be part of a diverse, inclusive culture across the development community, growing awareness, inclusivity, and balance.

You’ll have:

Essential:

  • Experience in software development or scripting, ideally with Ruby, Bash, PowerShell or similar
  • Experience troubleshooting web applications
  • Experience with Linux or other Unix-based operating systems
  • Experience building, troubleshooting and automating applications in public cloud-based systems
  • Basic understanding of networking
  • Enthusiasm to learn and share knowledge and work collaboratively in an inclusive and diverse multi-disciplinary team environment.

Desirable:

  • Experience using version control (ideally with Git)
  • Experience in analysing systems performance and configuration
  • Experience building, running and optimising Docker container images

Desirable criteria will be assessed only in the event of a tie-break, to help us make an informed decision.

Technical skills:

We'll assess you against these technical skills during the selection process:

  • A pragmatic approach to troubleshooting
  • Knowledge of Linux command line
  • Knowledge of public clouds
  • Programming logic