Site Reliability Engineer (SRE)

Site Reliability Engineer (Python Preferred)

We are looking for a skilled and adaptable Site Reliability Engineer (SRE) to join our team. This role is a blend of scripting and operational responsibilities, ideal for someone who enjoys both building automation and engaging in hands-on support to ensure system reliability and performance.

London hybrid working - Contract Opportunity - 3 days in Battersea

Must-haves
- Python scripting (candidates with Go may also be considered)
- Automation experience
- Prometheus / Grafana / PromQL
- CI/CD
- AWS
- Splunk

Key Responsibilities
- Develop and maintain automation scripts, primarily in Python (Go experience also considered).
- Respond to and resolve incidents, manage changes, and perform problem analysis to maintain system uptime and reliability.
- Collaborate with internal teams and customers to troubleshoot and resolve infrastructure and application issues.
- Operate and enhance observability tooling, including Prometheus, Grafana, and Splunk, with a strong focus on PromQL.
- Participate in an on-call rotation to support critical production systems.
- Improve and maintain CI/CD pipelines and deployment processes.
- Work with AWS cloud infrastructure to support scalable, secure, and resilient systems.
- Operate within a GitOps workflow and support Kubernetes-based environments.

Required Skills and Experience
- Strong scripting skills in Python (Go, Bash, or SQL also beneficial).
- Proven experience with automation and infrastructure-as-code practices.
- Deep understanding of monitoring and observability, particularly wit