Site reliability engineer job offer

Site Reliability Engineer

30 Sep 2024

Georgia, Atlanta, 30301

Vacancy expired!

The Site Reliability Engineer will drive cross-team initiatives that improve Delta engineering practices through increased accountability and deliver increased uptime and performance for the business. An ideal candidate would have prior experience implementing observability plans around logs, metrics, and traces.

What you'll be doing

Engage in and improve the whole lifecycle of services, from inception and design through deployment, operation, and refinement
Support capacity planning, availability, scalability, security and latency considerations for new infrastructure and service provisioning as appropriate
Take responsibility for improvements to end-to-end availability and performance of mission-critical services, and build automation to prevent problem recurrence
Strong experience setting SLOs, SLIs, and error budgets and managing reliability for infrastructure and applications
Partner with other SREs to share best practices and learnings from across the organization
Scale and optimize existing infrastructure and services sustainably through mechanisms, including automation, and evolve them by improving reliability and efficiency
Manage end-to-end availability and performance of mission-critical services and build automation to prevent problem recurrence
Maintain infrastructure and services by measuring and monitoring system metrics to proactively identify operational efficiencies, potential outages, and security threats in Development, UAT, Staging, and Production environments
Practice sustainable incident response and blameless postmortems
Build infrastructure and drive projects that break things with the aim of improving the robustness of production systems
Use the core Site Reliability Engineering principles of change management, monitoring, emergency response, capacity planning, and production readiness reviews to run the platform
Step back to observe patterns and develop innovative tools and automation to eliminate or minimize menial tasks. Use those learnings to drive the best operational practices
Develop and maintain solution and operational documentation and designs for all infrastructure and services within the scope of SRE
Preserve operational visibility and response capabilities by fixing and improving our dashboards, alerts, and automation

What you need to succeed (minimum qualifications)

Proficiency in one or more of the following scripting languages: JavaScript, Node.js, Python, Ansible, Bash, etc.
Experience handling large numbers of diverse systems with configuration management systems like Puppet, Chef, Ansible
Proven history of toil elimination by leveraging automation
Strong background using tools like PagerDuty for managing incidents
Strong experience with monitoring and alerting systems like Prometheus, Grafana, Datadog.
Understanding of standard networking protocols and components such as HTTP, DNS, ECMP, TCP/IP, ICMP, the OSI Model, Subnetting and Load Balancing strategies
Experience in Serverless Application Framework
Experience in containerized workloads and management platforms such as Docker or Kubernetes
Familiarity with distributed systems, including microservices, is a plus
Experience in Infrastructure automation tools such as CloudFormation, Terraform
Understanding of CI/CD processes and experience with deployment automation tools such as CodePipeline, CodeDeploy, Jenkins, Bamboo
Strong debugging, troubleshooting, and problem-solving skills
Effective communication, collaboration & negotiation skills with the ability to interface with various business units and third parties
Experience liaising with developers, operations staff and third-party resources
Experience with API integration projects
5+ years of total software engineering experience
2+ years supporting a production system on a DevOps team
2+ years of experience running and building systems in cloud platforms such as Amazon Web Services, Google Cloud or Microsoft Azure
Where permitted by applicable law, must have received or be willing to receive the COVID-19 vaccine by date of hire to be considered for U.S.-based job, if not currently employed by Delta Air Lines, Inc.

What will give you a competitive edge (preferred qualifications)

Bachelor's degree in Computer Science, Information Systems, or a related technical field.
Experience working in an airline technology environment.

A career at Delta not only gives you a chance to see the world; we also provide excellent benefits to help you keep climbing along the way!

Competitive salary, industry leading profit sharing and 401(k) with generous direct contribution and company match
Comprehensive health benefits including medical, dental, vision, short/long term disability and life benefits
A detailed wellness plan that recognizes the importance of physical, emotional, financial, and social wellbeing
Domestic and International flight privileges
Career development programs are available for your long-term career goals

Job Details

ID: JC46112482
State: Georgia
City: Atlanta
Job type: Permanent
Salary: N/A
Hiring Company: Delta Air Lines Inc.
Date: 2022-09-29
Deadline: 2022-11-27
Category: Et cetera