Site Reliability Specialist

15 Apr 2024
Richmond, Virginia 23173, USA

Vacancy expired!

Requisition 267994
Location: Open

The Richmond Fed is the proud home of the Federal Reserve's National IT organization, a nationwide team delivering technology solutions and support across the Federal Reserve System. Many National IT employees are located in Richmond, while others are based across the U.S. at other Federal locations.
When you join our team, you'll become part of a culture that welcomes differences, cares about our communities, and empowers each other to lead from where we are to make things better.
Bring your passion and we'll provide challenging and purposeful careers in a variety of fields, opportunities to grow and a wide range of benefits and perks that support your health and wealth. It's all part of what makes #MyRichmondFed a great place to work!

About the Opportunity
The FedNow department has an immediate opening for a Site Reliability Specialist, reporting to the Assistant Vice President.
As a Site Reliability Specialist, you will be part of the Technical Operations (TechOps) department, which has overall responsibility for the design, management, and execution of the operations required to support the ongoing technical and delivery needs of the FedNow Program infrastructure, as well as the transition to production support and operations. This team works with internal stakeholders and customers on planning, delivery, and service management. It owns ongoing ITIL processes and implements and drives continuous improvement initiatives. You will architect, implement, and leverage monitoring and tooling for capacity planning, utilization reporting, and scaling. The ideal candidate loves building and maintaining reliable, scalable systems and CI/CD tooling, and automating cloud-based, highly available, high-performing applications.

What You Will Do:

  • Provide technical functional expertise to the Architecture, Engineering, DevOps, and QA teams
  • Leverage SRE best practices and own responsibility for the availability and performance of the cloud infrastructure/platform
  • Focus on solving problems through software
  • Define SLIs/SLOs
  • Work with CI and CD tools, and source control such as Git and SVN
  • Implement performance monitoring and capacity management to detect and automatically resolve issues
  • Lead the team through continuous improvement of production operations
  • Offer technical support where needed and develop automation software to speed incident resolution
  • Build and maintain tools, services, and automations associated with the deployment and operations platform
  • Actively troubleshoot any issues that arise in production with the goal of providing permanent fixes - conduct root cause analysis of problems to prevent future occurrence
  • Maintain effective knowledge base and documentation
  • Drive innovation and platform evolution - identify potential breakdowns and drive improvements
  • Automate operational processes as needed, accurately and in compliance with security standards
  • Be a champion of operational excellence
  • Develop and maintain health dashboards
  • Provide rotational on-call support


Qualifications:
Expertise you will bring
  • Extensive knowledge and understanding of working in AWS environments & services
  • Familiarity with basic networking, security and cloud engineering concepts
  • Experience supporting infrastructure for large multi-service applications
  • Proficiency in scripting languages
  • Experience with performance tools
  • Experience working with configuration management tools
  • Working knowledge of databases
  • Ability to develop and maintain environment documentation and support procedures
  • Knowledge of technology project and secure coding standards
  • SRE experience with on-premises and cloud technologies
Education and Experience Requirements
  • Bachelor's degree in computer science or computer engineering
  • Minimum 10 years of hands-on experience in an application and technical support role
  • Minimum 5 years of SRE experience
  • Minimum 5 years of hybrid cloud infrastructure experience

Discover the Reason Why So Many People Love It Here!
When you join the Federal Reserve's National IT organization, not only will you find a challenging and purposeful career, you'll also have access to a wide range of benefits and perks that support your health and wealth, including:
  • Great medical benefits
  • Pension and 401(k) with employer match
  • Generous paid time off
  • Tuition reimbursement
  • Employee resource networks
  • Paid volunteer leave
  • Flexible work options
  • Onsite amenities that make working here fun
Other Requirements and Considerations:
  • Candidates should review the Bank's Employee Code of Conduct to ensure compliance with conflict of interest rules and personal investment restrictions. The Code is available on the About Us, Careers webpage at www.richmondfed.org.
  • Sponsorship is not available for this role.
  • The selected candidate is subject to special background check procedures.
  • Salary offered will be based on the job responsibilities and the individual's knowledge, skills, and experience as defined in the job qualifications/experience.
The Federal Reserve Bank of Richmond provides equal opportunity to all individuals without regard to race, sex, color, religion, gender identity or expression, sexual orientation, national origin, age, disability, or genetic information.