Listing Description
If you are a passionate engineering leader who thrives in a fast-paced, high-impact, high-growth environment, please read on! We are looking for a DevOps Engineer to continuously improve our development operations and support the reliability and availability of all our applications and services deployed to the cloud.
What you get to do:
- Partner with various engineering teams to own and manage availability, latency, performance, reliability and scalability of all services to maintain SLAs that our customers expect from us
- Provide strong technical leadership and people management to the team
- Act as an SME and demonstrate end-to-end ownership of the SRE function with an automation-first mindset
- Collaborate with Engineering Managers, Product Managers and Software Engineers across the organization to deliver a comprehensive solution for the SRE discipline and set effective SLOs/SLIs and error budgets for all services
- Establish clear processes, methodologies, metrics and KPIs to drive accountability
- Own and drive cloud cost management, capacity planning and performance management, and partner with engineering leaders to optimize for performance and cost
- Drive instrumentation of services for end-to-end observability: monitoring, alerting, metrics, logging and dashboards
- Own and drive the incident management process: conduct blameless postmortems, publish incident reports, and maintain runbooks and on-call schedules
- Coordinate with product development engineering teams on change management and release management processes
- Be an evangelist for SRE best practices
- Champion and implement strong DevSecOps principles and SDLC best practices
- Coach, mentor and cross-train team members
- Practice servant leadership and bring an Agile mindset
What you bring to the role:
- 5+ years of professional software engineering experience
- 5+ years of building and managing technical processes in SRE, DevOps or other related domains
- Experience supporting infrastructure and services in public cloud environments; AWS required
- Experience leading an SRE team with a strong emphasis on automation and continuous improvement
- Hands-on experience in developing applications in one or more language stacks: Java, Python, Ruby, JavaScript, Go, etc.
- Strong hands-on knowledge of one or more Infrastructure-as-Code tools and technologies: Terraform, AWS CloudFormation, Packer, etc.
- Strong knowledge of network engineering and foundational network protocols and services such as TCP/IP, HTTP, DHCP, DNS, VPN, etc.
- Experience with Agile software development and Scrum methodology
- Great communication, collaboration and presentation skills
- Strong problem-solving and analytical skills
Preferred Qualifications:
- AWS certification
- Expertise in Linux systems
- Experience building and managing fault-tolerant, large-scale distributed systems
- Strong hands-on experience in one or more container and container orchestration technologies: Docker, Kubernetes, Amazon ECS, Amazon EKS, AWS Fargate, etc.
- Experience with CI/CD, DevOps and Pipeline-as-Code: Argo CD, Jenkins, Spinnaker, GitLab CI/CD, etc.
- Experience with microservices architecture and the CNCF ecosystem
- Bachelor’s or Master’s Degree in Computer Science, Engineering or related discipline
Listing Details
- Citizenship: Not Provided
- Incentives: Not Provided
- Education: Not Provided
- Travel: Not Provided
- Telework: Not Provided