Senior Site Reliability Engineer

28 Oct 2025
South Carolina, Capetown 00000 Capetown USA

What you’ll do

- Lead incident response and postmortems: drive investigations, document learnings, and implement permanent fixes to prevent recurrence.
- Manage and optimize Azure Kubernetes environments: own cluster configurations, performance, cost control, and security best practices.
- Build observability systems: develop dashboards, alerts, and metrics using Dynatrace, Honeycomb, Elasticsearch, Grafana/Kibana, and Azure Monitor (KQL).
- Automate for resilience: write reliable scripts in PowerShell, Bash, Python, or C#, embedding logging, rollback, and version control.
- Implement Infrastructure-as-Code: design and maintain Terraform, Bicep, or ARM templates to standardize and automate deployments.
- Optimize system performance: identify bottlenecks through deep monitoring, dump analysis, and right-sizing of cloud resources.
- Collaborate across engineering teams: integrate reliability principles into CI/CD pipelines and the broader SDLC.
- Participate in on-call rotations: lead during critical incidents, ensuring lasting fixes and operational excellence.
