Cloud Operations Engineer

03 Dec 2024
Holmdel, New Jersey 07733, USA

Job Overview

The Cloud Operations Team, part of iCIMS Labs, is responsible for monitoring the application and infrastructure to help provide a customer experience second to none. The team monitors the customer experience, which ranges from validating availability and performance to orchestrating events across cross-functional teams and communicating customer-affecting events. In addition, the team reviews all key performance indicators (KPIs) to forecast future performance and make initial recommendations to the engineering team. The Application Monitoring Engineer will report to the Manager, Cloud Operations and will be responsible for overseeing the functionality of applications, batch processes, network, and infrastructure components, ensuring the highest possible availability (99.9% - 99.99%), and facilitating timely resolution of incidents or technical escalations to meet established SLAs.

About Us

When you join iCIMS, you join the team helping global companies transform business and the world through the power of talent. Our customers do amazing things: design rocket ships, create vaccines, deliver consumer goods globally, overnight, with a smile. As the Talent Cloud company, we empower these organizations to attract, engage, hire, and advance the right talent. We’re passionate about helping companies build a diverse, winning workforce and about building our home team. We're dedicated to fostering an inclusive, purpose-driven, and innovative work environment where everyone belongs.

Responsibilities

Manage the Production environment by monitoring availability/performance on a holistic level.

Utilize monitoring tools to track cloud resource utilization and performance metrics.

Generate regular reports on system performance and propose enhancements based on data analysis.

Incident Management: Manage the process to restore normal service operations as quickly as possible, including problem assessment, research, escalation, communications, and resolution task management.

Execute Production changes needed to support internal and external customers.

Provide triage support for Operational support requests.

Review and refine support documentation, SOPs, policies, procedures, and associated system requirements.

Develop and maintain automation scripts using Python and Java to streamline operational processes.

Implement Infrastructure as Code (IaC) practices to enhance deployment efficiency and consistency.

Prepare extensive electronic documentation that includes, but is not limited to, SLAs, performance metrics, installation guides, and implementation guides.

Reduce manual work by identifying repeatable tasks that can be handled by automation.

Participate in monthly metric reviews in support of 99.9% - 99.99% uptime.

Must be passionate and self-driven with a sense of urgency and initiative.

Proactively seek innovative solutions and resolve issues effectively and efficiently.

Qualifications

Bachelor of Science in Computer Science, or an equivalent combination of education and work experience.

Novice-level programming skills.

Strong interpersonal and communication skills.

Ability to maintain and validate documentation, including support processes, SOPs, policies, and procedures.

Ability to effectively prioritize and execute tasks.

Strong attention to detail and ownership of deliverables.

Preferred

Bachelor’s Degree in a technical field such as Computer Science, Information Systems, Math, or Software Engineering.

In lieu of bachelor’s or technical degree, we will accept 2 additional years of software or operational engineering experience.

Experience with AWS, SaaS applications, and other modern cloud-based tools.

AWS Cloud Practitioner level certification or other equivalent commercial cloud certification preferred.

Experience with monitoring and automation solutions (Zabbix, Grafana, Rundeck, Terraform, and Ansible).

Experience with Python and Java.

Experience with infrastructure as code (IaC) tools (e.g., Terraform, CloudFormation).

EEO Statement

iCIMS is a place where everyone belongs. We celebrate diversity and are committed to creating an inclusive environment for all employees. Our approach helps us to build a winning team that represents a variety of backgrounds, perspectives, and abilities. So, regardless of how your diversity expresses itself, you can find a home here at iCIMS.

We are proud to be an equal opportunity and affirmative action employer. We prohibit discrimination and harassment of any kind based on race, color, religion, national origin, sex (including pregnancy), sexual orientation, gender identity, gender expression, age, veteran status, genetic information, disability, or other applicable legally protected characteristics. If you would like to request an accommodation due to a disability, please contact us at careers@icims.com.

Compensation and Benefits

The target total compensation for this role is $X. Compensation will be based upon experience.

Competitive health and wellness benefits include medical, dental, vision, 401(k), dependent care, short term and long term disability, life and AD&D insurance, bonding and parental leave, mindfulness resources, an open vacation policy, sick days, paid holidays, quiet hours each workday, and tuition reimbursement. Benefits and eligibility may vary by location, role, and tenure. Learn more here: https://careers.icims.com/benefits
