Deploy, upgrade, operate, and scale our suite of mission-critical products and services.
Monitor our underlying infrastructure using modern observability tools to ensure application health.
Manage AWS cloud infrastructure to support our applications.
Set up CI/CD pipelines to streamline software delivery.
Collaborate closely with software engineers to create highly operable and maintainable products.
Engage in and enhance the entire software development lifecycle, from inception and design to deployment, operation, and refinement.
Implement best practices for Kubernetes, Istio, Golang, and AWS to optimize system performance and reliability.
Execute Argo canary deployments to ensure seamless feature rollouts and updates.
Practice sustainable incident response and conduct blameless postmortems.
Provide end-user support to our engineering team for product-related inquiries.
Participate in the team's on-call rotation as needed.
Identify and resolve performance bottlenecks and implement performance improvements.
Qualifications
Bachelor's degree in computer science, information systems, or an engineering discipline; OR 2+ years of professional experience in site reliability or DevOps.
Proficiency with Linux operating systems.
Experience
3+ years of experience in DevOps, site reliability engineering, or system administration.
3+ years of experience in cloud computing (AWS preferred).
Experience with infrastructure as code (IaC) tools for automated server management.
Experience with container and virtualization technologies like Kubernetes.
Experience with databases and data modeling.
Knowledge and Skills
Knowledge of build systems and package management tools.
Familiarity with source code and version control tools such as Git or Subversion.
Familiarity with automation frameworks like Terraform, Ansible, or Puppet.
Understanding of TCP/IP networking.
Proficiency with workflow and issue management tools such as Jira.
Ability to work with mission-critical systems and respond with appropriate urgency.
Effective communication skills for both formal and informal interactions.