This is a senior role, and no sponsorship is available now or in the future. No third-party candidates.
Our client is one of the world's largest online marketplaces, dealing with one billion events per day.
We are looking for a Senior Site Reliability Engineer or DevOps engineer with strong scripting and development skills who will contribute to the management and optimization of complex infrastructure based on AWS and Kubernetes, automate routine management of thousands of nodes and dozens of clusters, and reduce operational costs by improving resource usage such as disk and CPU utilization.
MAIN TASKS AND RESPONSIBILITIES
Obtains tasks from the project lead or Team Lead (TL), prepares functional and design specifications, and gets them approved by all stakeholders.
Ensures that assigned areas are delivered within set deadlines and meet the required quality objectives.
Provides estimates, agrees on task duration with the manager, and contributes to the project plan for the assigned area.
Analyzes the scope of alternative solutions and decides how to implement the area based on his/her experience and technical expertise.
Leads the functional and architectural design of assigned areas. Makes sure design decisions on the project meet architectural and design requirements.
Addresses area-level risks, and provides and implements mitigation plans.
Reports on area readiness and quality, and raises red flags in crisis situations that are beyond his/her area of responsibility (AOR).
Resolves crisis situations within his/her AOR.
Continuously develops his/her professional skills.
Collaborates with other teams.
REQUIRED EDUCATION AND EXPERIENCE
Must have:
University degree in Computer Science or a related field
5+ years of commercial experience as a Site Reliability Engineer, DevOps engineer, or similar
2+ years of experience with AWS stack (VPC, EC2, S3, KMS, ECR, IAM, Lambda, CloudWatch)
Development experience with Python, Go, or Ruby
Knowledge of Linux and bash better than your own house ;)
Experience with migrations, upgrades and monitoring
Hands-on experience working with production environments with zero tolerance for errors
Strong understanding of the Kubernetes ecosystem
Experience in Agile development environments
Good English (spoken and written) and strong communication skills in general