Senior Site Reliability Engineer
Salary: $120k-$140k + Bonus
Location: Hybrid, Chicago, IL or Dallas, TX
We are unable to provide sponsorship for this role.
Qualifications
Bachelor’s or Master’s degree in Computer Science, Information Systems, or a related field, or equivalent work experience
5-8 years of experience in Site Reliability Engineering / DevOps
Experience managing infrastructure in public cloud environments like AWS (preferred), Azure or Google Cloud Platform
Experience providing visibility using monitoring and alerting tools like Splunk, SignalFx, AppDynamics, Datadog, Stackdriver, Sysdig, Prometheus or Grafana
Programming/scripting experience in languages like Java, Bash, Python or Go
Experience with distributed messaging systems like Kafka, RabbitMQ, or ActiveMQ
Experience with container orchestration systems like Kubernetes, Mesos, Docker Swarm or Rancher
Experience using Continuous Integration and Continuous Delivery (CI/CD) tools like Jenkins, Travis, Harness, Spinnaker, AppVeyor, CodeBuild or CodePipeline
Responsibilities
Collaborate with development, operations, and infrastructure teams to ensure service availability and work through implementation issues
Develop automation for incident response and to prevent problem recurrence
Create and enhance runbooks to respond to service outages or degradations
Assess the production readiness of services
Define and track operational metrics for production performance, reliability, scalability, and availability
Architect, develop and maintain shared services and tools to improve reliability and reduce toil across the organization
Contribute to the team’s continuous improvement through research, retrospectives, discussion groups and code reviews
Provide leadership within the team by guiding and mentoring junior members and by preparing stories for the sprint backlog