Responsibilities
Design, implementation and maintenance of public-facing infrastructure and services
Use of configuration management and deployment tools
Architectural design and operation at scale
Monitoring of systems and services, optimization of performance and resource utilization
Common operating system-level tasks such as logging and backup/restore
Cookbook / runbook implementation for common maintenance actions
Incident response, diagnosis and follow-up on system outages or alerts
Automation and streamlining of tasks as well as identifying process gaps
Collaborating with a global team that communicates asynchronously (don’t worry if you have never worked remotely; we’ll help you get used to it)
Mentoring peers in your areas of technical and operational strength
Skills and Experience
5+ years of experience in an SRE/Operations/DevOps role
Experience operating highly available infrastructure
Experience running applications and services at scale
Proficiency in shell scripting and in a programming language used in an SRE/Operations engineering context (Python, Go, Ruby, etc.)
Comfortable with open-source configuration management and orchestration tools (Puppet, Ansible, Terraform, etc.)
Ability to communicate clearly in technical English