A well-established business is seeking to appoint a Senior Compute Systems Engineer.
The Senior Computer Systems Engineer will lead the compute and storage systems team and report to the Site Reliability Engineering (SRE) Manager within Computing & Software. The role provides hands-on technical leadership in the design, implementation, and long-term operation and maintenance of secure, reliable, and high-performance computer systems infrastructure for the telescopes hosted by the company.
Qualifications:
BTech in Computer Science, Software Engineering, Information Systems, Electronic Engineering or equivalent qualifications coupled with 13 years’ experience;
BEng/MTech in Computer Science, Software Engineering, Information Systems, Electronic Engineering or equivalent qualifications coupled with 9 years’ experience;
MEng in Computer Science, Software Engineering, Information Systems, Electronic Engineering or equivalent qualifications coupled with 7 years’ experience; or
PhD in Computer Science, Software Engineering, Information Systems, Electronic Engineering or equivalent qualifications coupled with 5 years’ experience.
Experience:
3+ years in a technical leadership or software/system architectural role with direct responsibility for large-/platform-scale distributed systems.
Demonstrated hands-on experience in infrastructure design and automation, distributed systems, observability, CI/CD, container orchestration (e.g., Kubernetes), DevOps/SRE practices, and cloud-native technologies.
Experience leading teams or initiatives that intersect with data platforms, storage, networking, and systems engineering domains.
Knowledge:
In-depth understanding of systems engineering principles, including performance optimization, fault tolerance, and resource scheduling in Linux-based environments.
Strong knowledge of containerized environments (Docker, Podman), orchestration platforms (Kubernetes, Helm), and runtime architectures (containerd, CRI).
Expertise in infrastructure-as-code, continuous integration/deployment (CI/CD), and configuration management tools (e.g., GitLab CI, Ansible, Terraform, ArgoCD).
Advanced understanding of distributed computing and storage architectures, including Ceph, S3, NFS, and local/clustered file systems.
Operational and architectural fluency in relational and NoSQL database systems (e.g., PostgreSQL, MySQL, MongoDB), including replication, backups, and performance tuning.
Working knowledge of networking fundamentals, security protocols, and system-level observability (e.g., Prometheus, Grafana, ELK/EFK stack).
Familiarity with the HPC ecosystem (e.g., SLURM, job schedulers) is beneficial for environments supporting scientific or research computing.