Site Reliability Engineer - DevOps - (Manchester Remote)

London, United Kingdom
14 May 2023
28 May 2023
Job Function
Industry Sector
Finance - General
Employment Type
Full Time
Codified Controls - Site Reliability Engineer - DevOps

About The Department

Developer Engineering is a new function within ICG (Institutional Clients Group) Technology. Our mission is to make it easy and enjoyable for software engineering teams to go from a business idea to delivering an innovative product solution. The main goals are to improve and upgrade our tools, streamline our processes, automate and strengthen our controls, and help development teams adopt modern working methods.

This new initiative represents a critical investment in our future development capability. We are building an expert team to transform the working environment of the 18,000 people who make up the ICG development community and help them dramatically change their approach to developing software across the many different technologies we support. The Developer Engineering group has a challenging remit, but with the scale and variety comes a unique opportunity to be agents of cultural and technical change who significantly impact the bank.

About The Team

Within the ICG Developer Engineering department, the Codified Controls group is a dedicated expert team focused on driving the everything-as-code agenda and delivering tangible reductions in process friction, errors, and manual effort to comply with and administer our policies, standards and the controls that derive from them.

The Codified Controls group is building solutions and platforms, breaking down technical barriers and strengthening existing systems. The group has the mandate and a unique opportunity as part of a greenfield programme to impact critical technical decisions across the company and change how the organisation is applying controls and writing policies globally.

The Codified Controls group is a cross-functional team of engineers, data scientists, business analysts, and product managers that will directly engineer new codified controls and the underlying capabilities required to support them, and drive other groups to provide the automated controls.

They are also cultural and behavioural leaders who are champions of great teamwork and highly opinionated on human-centric approaches to automating and codifying our controls and procedures.

The Role

We are seeking a skilled Site Reliability Engineer (SRE) to join our growing product team. The ideal candidate is passionate about ensuring the stability, reliability and performance of our web applications and services. This role requires someone with a background in software development and engineering, experience working in product teams and a deep understanding of site reliability engineering principles. You will be responsible for maintaining and improving the reliability of our systems while collaborating with our product, development, and operations teams to optimise the user experience.

  • Implement, maintain and improve monitoring, logging and alerting systems to ensure high levels of system reliability, availability, and performance in accordance with our Service Level Objectives (SLOs)
  • Collaborate with product and engineering teams to design, build and maintain scalable and reliable web applications and services
  • Analyse system performance and proactively identify areas for improvement, optimisation, and capacity planning
  • Develop and maintain automation tools and frameworks for deploying, managing, and monitoring infrastructure and applications
  • Participate in incident management, root cause analysis and resolution processes, and implement corrective measures to prevent recurrence
  • Advocate for and incorporate the best practices in site reliability engineering, including CI/CD, infrastructure as code and automated testing
  • Stay up to date with underlying infrastructure services and products offered by partner teams and represent our team in the improvement and adoption of those services
  • Strong problem-solving and critical thinking skills, with a keen attention to detail
  • Excellent communication and collaboration skills
  • A pragmatic and creative approach to managing risk
  • An advocate of inclusion and diversity in every way
  • A growth mindset and willingness to learn and adapt in a fast-paced environment
  • Passionate about site reliability engineering and its impact on product development
  • A self-starter with the ability to work effectively in teams and remotely
Experience:
  • Demonstrated experience as a Site Reliability Engineer, DevOps Engineer, or similar role
  • Strong coding skills in at least one programming language (we use Python, JavaScript, and Go within our team, and lots of Java throughout the bank)
  • Experience working in product teams and collaborating with developers, product managers and other stakeholders
  • Solid understanding of site reliability engineering principles, including monitoring, alerting, incident management and root cause analysis
  • Hands-on experience with logging and alerting systems such as ELK Stack (Elasticsearch, Logstash, Kibana), Grafana, Prometheus or Splunk
  • Experience with Red Hat OpenShift, Kubernetes, and container orchestration, and with their as-code management tools such as Helm and kubectl
  • Familiarity with database and data technologies (we use MongoDB and Kafka)
  • Proficiency with CI/CD processes, tools, and best practices
Job Family Group:
Technology
Job Family:
Applications Development
Time Type:
Full time
Citi is an equal opportunity and affirmative action employer.

Qualified applicants will receive consideration without regard to their race, color, religion, sex, sexual orientation, gender identity, national origin, disability, or status as a protected veteran.

Citigroup Inc. and its subsidiaries ("Citi") invite all qualified interested applicants to apply for career opportunities. If you are a person with a disability and need a reasonable accommodation to use our search tools and/or apply for a career opportunity, review Accessibility at Citi.

View the "EEO is the Law" poster. View the EEO is the Law Supplement.
View the EEO Policy Statement.
View the Pay Transparency Posting.