  • Posted: Jun 4, 2025
    Deadline: Not specified

    Teraco is the first provider of resilient, vendor-neutral data environments in South Africa. Clients benefit from the cost savings and improved resilience of securely housing their information systems and networking equipment in a colocation facility purpose-built and operated to global best practice by an expert organisation with an absolute focus on data c...

    Incident and Problem Manager

    MAIN FUNCTIONS OF THE JOB

    Problem Management:

    • Analyse incidents to identify recurring patterns.
    • Conduct root cause analysis to understand the underlying causes of problems. 
    • Develop and implement corrective actions to address root causes and eliminate future incidents.
    • Work with relevant teams to implement solutions and updates to prevent similar problems.
    • Ensure response teams are coordinated and effective in investigating and resolving major complex problems. (Responsible team will assume incident management responsibility for a given event)
    • Collaborate with subject matter experts to resolve complex problems, and track the problem lifecycle from identification to resolution.
    • Track tickets for all corrective actions and validate that the corrective actions are implemented as required.  
    • Maintain a problem knowledge base and documentation to share learnings across the organisation and facilitate quicker resolution of similar incidents in the future.
    • Manage problem resolution bridges, provide timely and clear updates to stakeholders, and document critical action items to drive resolutions.
    • Own and lead a structured Root Cause Analysis (RCA) process to resolve major incidents and problems. 
    • Facilitate root cause and corrective action plan meetings after the correction has been implemented. Ensure that the responsible managers document incident details and post-incident analysis to learn from events, and that incident reports reflect all root causes, corrections and corrective actions.
    • Drive teams to document and submit incident reports within OLA and SLA timeframes.
    • Signatory on all incident reports across the business. 
    • In collaboration with the Client Experience Manager, identify improved reporting formats and templates. Drive consistency across Teraco’s operational organisation. 
    • Review incident response plans and procedures and identify improvement opportunities using data and metrics

    Incident and Problem Management Framework:

    • Implement a clear and concise Incident and Problem Management framework to ensure incidents are handled in line with established policies and procedures, and to increase efficiency of incident response
    • Establish a range of root cause analysis techniques and, where required, coach leadership in applying them, to drive a culture of effective root cause analysis.
    • Ensure communication plans are in place and ready for activation during major incidents
    • Create a communication and escalation framework to ensure stakeholders are kept up to date about incident status and impact. DCO staff will assume incident management responsibility for a given incident; facilitate communication during incidents to ensure a coordinated response.
    • Collaborate with the Client Experience Manager on client-impacting incidents to ensure clients’ interests are central to Teraco’s response to incidents, and that there is effective communication with clients.

    QUALIFICATIONS AND EXPERIENCE

    • Bachelor’s degree in a relevant field (e.g., IT, Engineering, Business Management, or similar) preferred, or equivalent experience
    • Certifications (highly beneficial):
      • ITIL v3/v4 Foundation or Intermediate Level
      • RCA/Problem Solving training (e.g., Kepner-Tregoe, Six Sigma Yellow/Green Belt)
      • ISO standards familiarity (especially ISO 27001, 50001 or ISO 9001)
    • 5+ years in incident and/or problem management roles, ideally within data centre, critical machinery and/or electrical infrastructure or similar high-availability environments
    • Experience in managing major incidents and leading post-mortems
    • Proven track record of implementing effective corrective and preventive action plans
    • Familiarity with operational workflows in critical facilities (e.g., infrastructure systems, networks)
    • Experience collaborating with client-facing and technical teams
    • Background in managing communication during major service disruptions
    • Experience working within Root Cause and Corrective Action frameworks

    Method of Application

    Interested and qualified? Go to Teraco on teraco.mcidirecthire.com to apply
