Site Reliability Engineer - Associate - TOP COMMERCE PLATFORM
Full-time
Associate Site Reliability Engineer
PHILADELPHIA, PENNSYLVANIA / INFRASTRUCTURE – PLATFORM OPERATIONS / FULL TIME
Primary Job Responsibilities
Maintain live services through measuring and monitoring availability, latency, and overall system health.
Identify measurable SLIs across all software products and build proactive monitoring around them.
Collaborate with other team members to quickly determine the root cause of any service degradation and watch for key indicators of potential issues.
Effectively communicate with third parties, partners, and internal teams regarding technical issues.
Automate existing procedures that currently require manual effort.
Use acquired knowledge to suggest, edit, and write knowledge base articles and SOPs for alerts and other processes.
Collaborate with infrastructure and development teams to improve system reliability, emphasizing reliability as a core value.
Identify anomalies in application and system log files and escalate them appropriately.
Obtain a deep understanding of the platform and custom application stack to help resolve issues.
Create new alerts to support the operations of production systems.
Troubleshoot and remediate failed scheduled jobs.
Be available for periodic support on nights and weekends.
Required Skills and Experience
BSc degree in Computer Science or an equivalent degree.
3-5 years of technical experience in web-based technology environments and the related dependencies between the platform, database, and application levels.
Experience with the Microsoft SQL Server product stack: databases, Integration Services, and Reporting Services.
Strong SQL skills.
Experience with scripting languages such as PowerShell or Python.
Experience in enterprise incident management practices.
Working knowledge of application development.
Ability to quickly receive and process information and make appropriate risk-based decisions.
Desired
Knowledge of networking protocols, DNS, HTTP, load balancing, and web servers.
Experience in the support of real-time transaction processing applications.
Experience with Azure Cloud.
Experience with RunDeck Scheduling System.