  • Posted: Apr 24, 2026
    Deadline: Not specified
  • Datafin was established in 1999 due to the need for a specialized IT recruitment solution. We offer a personalized and flexible recruitment service, specializing in providing both client and candidate with the perfect fit. We pride ourselves on the fact that we have established relationships with industry leaders and a vast majority of our business is repeat...

    Senior Computer Systems Engineer (CPT)

    ENVIRONMENT:

    • Our client is a prominent organisation focused on supporting research advancement and human capital development through funding programmes, research infrastructure, and science outreach initiatives across a broad range of disciplines.
    • They are looking for a Senior Computer Systems Engineer to guide infrastructure development, mentor team members, and ensure systems align with their principles.
    • Responsibilities include deploying and optimising systems, managing faults, contributing to long-term infrastructure planning, and ensuring scalable, maintainable operations. The position plays a key role in cross-team collaboration, driving innovation while supporting sustainable and resilient computing environments.

    DUTIES

    • Contribute to the global design and implementation of scalable, fault-tolerant infrastructure systems that support engineering and operational needs.
    • Contribute to the deployment, configuration, and maintenance of distributed storage and database systems.
    • Analyse system failures, performance issues, and misconfigurations across hardware, software, and network layers.
    • Lead and mentor computer systems engineers and contribute to strategic technical planning.

    REQUIREMENTS

    Qualification:

    • BTech in Computer Science, Software Engineering, Information Systems, Electronic Engineering or equivalent qualifications, coupled with 13 years of experience; OR
    • BENG/MTech in Computer Science, Software Engineering, Information Systems, Electronic Engineering or equivalent qualifications, coupled with 9 years of experience; OR
    • MENG in Computer Science, Software Engineering, Information Systems, Electronic Engineering or equivalent qualifications, coupled with 7 years of experience; OR
    • PhD in Computer Science, Software Engineering, Information Systems, Electronic Engineering or equivalent qualifications, coupled with 5 years of experience.

    Experience:

    • 3+ years in a technical leadership or software/system architectural role with direct responsibility for large-scale or platform-scale distributed systems.
    • Demonstrated hands-on experience in infrastructure design and automation, distributed systems, observability, CI/CD, container orchestration (e.g., Kubernetes), DevOps/SRE practices, and cloud-native technologies.
    • Experience leading teams or initiatives that intersect with data platforms, storage, networking, and systems engineering domains.

    Knowledge:

    • In-depth understanding of systems engineering principles, including performance optimisation, fault tolerance, and resource scheduling in Linux-based environments.
    • Strong knowledge of containerised environments (Docker, Podman), orchestration platforms (Kubernetes, Helm), and runtime architectures (containerd, CRI).
    • Expertise in infrastructure-as-code, continuous integration/deployment (CI/CD), and configuration management tools (e.g., GitLab CI, Ansible, Terraform, ArgoCD).
    • Advanced understanding of distributed computing and storage architectures, including Ceph, S3, NFS, and local/clustered file systems.
    • Operational and architectural fluency in relational and NoSQL database systems (e.g., PostgreSQL, MySQL, MongoDB), including replication, backups, and performance tuning.
    • Working knowledge of networking fundamentals, security protocols, and systems-level observability (e.g., Prometheus, Grafana, ELK/EFK stack).
    • Familiarity with the HPC ecosystem (e.g., SLURM, job schedulers) is beneficial for environments supporting scientific or research computing.

    ATTRIBUTES

    Core Competencies (Essential):

    • Demonstrated technical leadership (3+ years), leading cross-functional efforts across systems, storage, and database infrastructure, driving technical decisions from architecture through implementation.
    • Systems engineering expertise, with a focus on Linux administration, infrastructure automation, service orchestration, and performance optimisation across diverse environments.
    • Expertise in distributed systems architecture, including the design and deployment of scalable, resilient services using microservices, event-driven, and cloud-native design patterns.
    • Containerisation and orchestration fluency, including production-grade usage of Kubernetes, Docker, and Helm for system and application-level deployments.
    • Infrastructure automation and CI/CD, using tools such as GitLab CI, ArgoCD, FluxCD, Jenkins, or GitHub Actions to streamline and secure platform operations.
    • Complementary DevOps and SRE practices, blending infrastructure-as-code, configuration management, and release automation with incident response, monitoring, SLIs/SLOs, and system reliability engineering.
    • Linux expertise, including advanced troubleshooting, kernel tuning, Systemd orchestration, and optimisation at scale.
    • Technical delivery and planning capabilities, including backlog scoping, cross-team collaboration, and Agile sprint execution.
    • Database administration skills, with operational experience in administering relational and NoSQL databases (e.g., PostgreSQL, MySQL, MongoDB), including high availability, backups, replication, and performance tuning.
    • Diagnostic skills, with a root-cause-first approach, and a strong bias for ownership, accountability, and long-term operational stability.

    Skills:

    • Technical leadership: Ability to lead architectural discussions, influence design decisions, and mentor junior engineers across infrastructure streams.
    • Resource management/leadership: Provides leadership that fosters an environment encouraging new ideas and supports the development of emerging skills. Creates trust through consistency, understanding, integrity, and patience. Plans, seeks, allocates, and monitors resources to achieve outcomes.
    • Problem solving and analysis: Skilled in root cause analysis, systems troubleshooting, and performance bottleneck resolution.
    • Communication and collaboration: Clear articulation of technical recommendations, cross-functional stakeholder engagement, and feedback integration.
    • Planning and delivery: Proficient in backlog grooming, sprint planning, and technical delivery in Agile/DevOps environments.
    • Continuous learning: Commitment to staying current with evolving technologies in containerisation, cloud-native systems, observability, and systems automation.
    • Documentation and knowledge sharing: Ability to produce high-quality technical documentation and share knowledge across engineering teams.
    • Teamwork: Collaborates within their team and with cross-functional teams alongside partners.
    • Service Level Agreements (SLAs): Ability to interpret, monitor, and manage SLAs, warranties, and related contractual obligations, and an understanding of operational frameworks such as Site Reliability Engineering (SRE), ITIL, and COBIT.

    Tooling Proficiency (this is not an exhaustive list; additional relevant experience or skills will be viewed favourably):

    • Containerisation & Orchestration: Kubernetes, Docker, Podman, Helm, containerd
    • Resource Management: SLURM (or other schedulers)
    • Hardware & Infrastructure Acceleration: GPU & FPGA drivers
    • Automation & Configuration Management: Ansible, Terraform, Bash, Python, Systemd, Packer
    • CI/CD and Release Management: GitLab CI, GitHub Actions, Jenkins, Ansible Tower, ArgoCD/FluxCD, cron/at/Systemd timers
    • Cloud, Virtualisation, and Bare-Metal Platforms: OpenStack, VMware vSphere/ESXi, Proxmox, KVM, AWS EC2/Storage, Terraform
    • Storage & Filesystem Tools: Ceph, NFS, iSCSI, ZFS, Lustre, or related
    • Database Operations (Operational DBA Tools): PostgreSQL CLI tools, MySQL, MongoDB, TimescaleDB, cron-based backups, or related
    • Monitoring & Observability: Prometheus, Grafana, Zabbix, ELK stack, or related

    Organisational Values:

    The Senior Computer Systems Engineer will be expected to demonstrate the following values and to work actively to instil those behaviours in all their colleagues in South Africa:

    • Diversity and Inclusion
    • Excellence
    • Collaboration
    • Creativity and Innovation
    • Sustainability
    • Passion for Excellence
    • World-class service
    • People-centred approach
    • Respect
    • Integrity and Ethics
    • Accountability

    Method of Application

    Interested and qualified? Go to Datafin Recruitment on datafin.com to apply
