We are seeking to fill a Cloud SRE position to help with Internal Cloud, Public Cloud (Azure/AWS), and Containers (OpenShift/Docker).
Candidates must have 5+ years of experience working with Unix/Linux Server platforms.
Must be extremely proficient in Shell, Python, and Ansible scripting, with experience in Ansible Tower administration and support.
Must have experience with the whole lifecycle of cloud services, from inception and design through deployment, operation, and support.
A successful candidate must have hands-on experience and be able to provide on-call support. They should be able to work both independently and with a distributed team.
Responsibilities:
Responsible for reliability and support of Internal Cloud, Public Cloud (Azure/AWS), and Containers (Docker) services.
Responsible for Ansible Tower administration activities (such as managing, upgrading, and supporting the Ansible Tower deployment).
Support services before they go live through activities such as system design consulting, developing software platforms and frameworks, capacity planning and launch reviews.
Maintain services once they are live by measuring and monitoring availability, latency and overall system health.
Troubleshoot issues across the entire stack: hardware, software, application, and network.
Perform deep dives into both systemic and latent reliability issues; partner with engineering and operations teams across the organization to produce and roll out fixes.
Drive standardization efforts across multiple disciplines and services in conjunction with embedded SREs throughout the organization.
Identify and drive opportunities to improve automation for the cloud services.
Scope and create automation for deployment, management, and visibility of our services.
Provide on-call coverage and support break-fix needs when required.
Required Qualifications
BS/MS degree in Computer Science or a related technical field involving systems, or equivalent practical experience.
Minimum of 5 years working with Linux/Windows operating systems.
Minimum of 5 years scripting in Python/Shell/Bash/Ksh.
Experience with Ansible Tower administration activities (such as installing, setting up, managing, upgrading, and supporting an Ansible Tower deployment).
Experience with Docker, Kubernetes, and OpenShift.
Experience with VMware tools, AWS and Azure cloud, and hypervisor technologies (ESXi, KVM, Xen).
Experience with SQL/NoSQL databases such as MySQL and MongoDB.
Experience with CI/CD tools such as Git and Jenkins.
Experience with Terraform/Consul/Nomad/Vault is a plus.
Experience creating and maintaining complex data-driven automations and queries using SQL and NoSQL databases.
Proven ability to work independently with minimal supervision and as part of a team with direct responsibilities.
Experience with TCP/IP, routing, DNS, Active Directory, Kerberos, DMZ, etc.
Ability to juggle competing priorities and adapt to changes in project scope.
Ability to communicate and collaborate effectively with teammates and internal clients.
Effective verbal and written communication.
Top 3 Must-Have Skillsets:
Minimum of 5 years working with Linux/Windows operating systems.
Minimum of 5 years scripting in Python/Shell/Bash/Ksh.
Experience with Ansible Tower Administration activities
Level of Experience Needed: 5+ Years
location: Jersey City, New Jersey
job type: Contract
salary: $60 - $70 per hour
work hours: 9am to 5pm
education: Bachelors
qualifications:
Experience level: Experienced
Minimum 5 years of experience
Education: Bachelors
skills:
R/Python
Shell
AWS
Equal Opportunity Employer: Race, Color, Religion, Sex, Sexual Orientation, Gender Identity, National Origin, Age, Genetic Information, Disability, Protected Veteran Status, or any other legally protected group status.