Cloud Site Reliability Engineer

02 Nov 2021
Jersey City, New Jersey 07302, USA

job summary:

Description:



  • We are seeking to fill a Cloud SRE position to support Internal Cloud, Public Cloud (Azure/AWS), and Containers (OpenShift/Docker).
  • Candidates must have 5+ years of experience working with Unix/Linux Server platforms.
  • Must be extremely proficient in Shell, Python, and Ansible scripting, with experience in Ansible Tower administration and support.
  • Must have experience with the whole lifecycle of cloud services, from inception and design through deployment, operation, and support.
  • A successful candidate must have hands-on experience and be able to provide on-call support. They should be able to work independently and with a distributed team.

Responsibilities:



  • Responsible for reliability and support of Internal Cloud, Public Cloud (Azure/AWS), and Containers (Docker) services.
  • Responsible for Ansible Tower administration activities (such as managing, upgrading, and supporting the Ansible Tower deployment).
  • Support services before they go live through activities such as system design consulting, developing software platforms and frameworks, capacity planning and launch reviews.
  • Maintain services once they are live by measuring and monitoring availability, latency and overall system health.
  • Troubleshoot issues across the entire stack: hardware, software, application, and network.
  • Perform deep dives into both systemic and latent reliability issues; partner with engineering and operations teams across the organization to produce and roll out fixes.
  • Drive standardization efforts across multiple disciplines and services in conjunction with embedded SREs throughout the organization.
  • Identify and drive opportunities to improve automation for the cloud services.
  • Scope and create automation for deployment, management and visibility of our services.
  • Provide on-call coverage and support break-fix needs when required.

Required Qualifications



  • BS/MS degree in Computer Science or a related technical field involving systems, or equivalent practical experience.
  • Minimum of 5 years working with Linux/Windows operating systems.
  • Minimum of 5 years scripting in Python/Shell/Bash/Ksh.
  • Experience with Ansible Tower administration activities (such as installing, setting up, managing, upgrading, and supporting an Ansible Tower deployment).
  • Experience with Docker, Kubernetes, and OpenShift.
  • Experience with VMware tools, AWS and Azure cloud, and hypervisor technologies (ESXi, KVM, Xen).
  • Experience with SQL/NoSQL databases such as MySQL and MongoDB.
  • Experience with CI/CD tools such as Git and Jenkins.
  • Experience with Terraform/Consul/Nomad/Vault is a plus.
  • Experience creating and maintaining complex data-driven automations and queries using SQL and NoSQL databases.
  • Proven ability to work independently with minimal supervision and as part of a team with direct responsibilities.
  • Experience with TCP/IP, routing, DNS, Active Directory, Kerberos, DMZ, etc.
  • Ability to juggle competing priorities and adapt to changes in project scope.
  • Ability to communicate and collaborate effectively with teammates and internal clients.
  • Effective verbal and written communication.

Top 3 Must-Have Skill Sets:



  • Minimum of 5 years working with Linux/Windows operating systems.
  • Minimum of 5 years scripting in Python/Shell/Bash/Ksh.
  • Experience with Ansible Tower administration activities.
  • Level of experience needed: 5+ years.



location: Jersey City, New Jersey

job type: Contract

salary: $60 - 70 per hour

work hours: 9am to 5pm

education: Bachelors



qualifications:


  • Experience level: Experienced
  • Minimum 5 years of experience
  • Education: Bachelors


skills:
  • R/Python
  • Shell
  • AWS



Equal Opportunity Employer: Race, Color, Religion, Sex, Sexual Orientation, Gender Identity, National Origin, Age, Genetic Information, Disability, Protected Veteran Status, or any other legally protected group status.

Job Details

  • ID
    JC5438508
  • State
    New Jersey
  • City
    Jersey City
  • Job type
    Contract
  • Salary
    USD $60 - 70 per hour
  • Hiring Company
    Randstad Corporate Services
  • Date
    2020-11-02
  • Deadline
    2021-01-01
  • Category
