Site Reliability Engineer - Associate

18 Jun 2024
Philadelphia, Pennsylvania 19019, USA

Vacancy expired!

Site Reliability Engineer - Associate - TOP COMMERCE PLATFORM

Full-time

Associate Site Reliability Engineer
PHILADELPHIA, PENNSYLVANIA / INFRASTRUCTURE – PLATFORM OPERATIONS / FULL TIME


Primary Job Responsibilities
  • Maintain live services through measuring and monitoring availability, latency, and overall system health.
  • Identify measurable SLIs across all software products and build proactive monitoring around them.
  • Collaborate with other team members to quickly determine the root cause of any service degradation and watch for key indicators of potential issues.
  • Effectively communicate with third parties, partners, and internal teams regarding technical issues.
  • Automate existing procedures that currently require manual effort.
  • Use acquired knowledge to suggest, edit, and write knowledge base articles and SOPs for alerts and other processes.
  • Collaborate with infrastructure and development teams to improve system reliability, emphasizing reliability as a core value.
  • Identify anomalies in application and system log files and escalate issues appropriately.
  • Obtain a deep understanding of the platform and custom application stack to help resolve issues.
  • Create new alerts to support the operations of production systems.
  • Troubleshoot and remediate failed scheduled jobs.
  • Be available for periodic support on nights and weekends.

Required Skills and Experience
  • BSc degree in Computer Science or an equivalent degree.
  • 3-5 years of technical experience in web-based technology environments and the dependencies between platform, database, and application levels.
  • Experience with the Microsoft SQL Server product stack: databases, Integration Services, and Reporting Services.
  • Strong SQL skills are required.
  • Experience with scripting languages such as PowerShell or Python.
  • Experience in enterprise incident management practices.
  • Working knowledge of application development.
  • Ability to quickly receive and process information and make appropriate risk-based decisions.

Desired
  • Knowledge of networking protocols, DNS, HTTP, load balancing, and web servers.
  • Experience supporting real-time transaction processing applications.
  • Experience with Azure Cloud.
  • Experience with the RunDeck scheduling system.