Site Reliability Engineer

1 month ago


Cape Town, South Africa Lesaka Technologies Full time

Kazang – Micro Merchant Division
Senior Site Reliability Engineer
A vacancy exists for a Senior SRE within the Kazang - Micro Merchant Division, in Cape Town, South Africa (Hybrid).
We are seeking a Site Reliability Engineer (SRE) with expertise in Linux-based, open-source environments to ensure the reliability, scalability, and performance of our systems. In this role, you will design and implement automated solutions for monitoring and system optimisation while managing and maintaining critical infrastructure. You will work closely with the DevOps team to support deployments and CI/CD pipelines, leveraging open-source tools to address operational challenges and enhance system resilience.

Key Responsibilities include, but are not limited to:

- Design, implement, and maintain reliable systems in a Linux and open-source environment to meet uptime and performance objectives.
- Support the DevOps team with CI/CD pipelines, ensuring seamless and reliable deployments.
- Manage and optimize AWS-based infrastructure for scalability, cost efficiency, and performance.
- Develop and maintain monitoring and alerting systems to ensure observability and proactively address system issues.
- Build and maintain robust solutions for metric collection, dashboarding, and alerting to provide actionable insights and real-time system visibility.
- Conduct root cause analysis for incidents, implementing preventive measures to improve system resilience.
- Perform regular system maintenance, including updates, patches, and optimizations.
- Prepare and deliver comprehensive reporting on system performance, incidents, and reliability metrics.
- Identify and mitigate risks to system reliability, scalability, and security.
- Ensure compliance with organizational and regulatory standards in system design and operations.
- Participate in a rotational on-call schedule to ensure the reliability and availability of critical systems.

In order to be considered for this position, the following requirements must be met:
Years of Experience:

- A minimum of 5 years of professional experience in Site Reliability Engineering, DevOps, or a related field, with demonstrated expertise in Linux-based, open-source environments and cloud infrastructure (AWS).

Education:

- A Bachelor's degree in Computer Science, Information Technology, Engineering, or a related field is required.
- Equivalent practical experience in lieu of a formal degree will be considered for highly qualified candidates.

Technical Competencies:

- Fault Finding and Debugging
Expertise in diagnosing and resolving complex system issues, including performance bottlenecks, service outages, and application errors, using debugging tools, logs, and monitoring data.
- Scripting and Programming
Proficiency in at least one programming or scripting language (e.g., Python, Bash, Go), with the ability to write automation scripts, develop tools, and optimize system performance.
- Cloud Infrastructure Management (AWS)
Hands-on experience with AWS services (e.g., EC2, S3, RDS, VPC), with the ability to design, manage, and optimize cloud-based infrastructure for scalability, reliability, and cost-efficiency.
- Monitoring and Observability
Skilled in implementing monitoring solutions (e.g., Prometheus, Grafana, ELK stack) and designing systems for metrics collection, dashboarding, and alerting to ensure system health and performance.
- Automation and Infrastructure as Code (IaC)
Proficiency with tools like Ansible, Terraform, or similar frameworks to automate system management, deployments, and configurations, reducing manual effort and ensuring consistency.

Behavioural Competencies:

- Problem-Solving and Critical Thinking
Demonstrates a proactive and analytical approach to identifying issues, diagnosing root causes, and implementing effective solutions in complex technical environments.
- Collaboration and Teamwork
Works effectively with cross-functional teams, including DevOps, development, and operations, fostering a culture of shared ownership and open communication to achieve reliability goals.
- Adaptability and Continuous Learning
Embraces change, learns new technologies quickly, and adjusts strategies to meet evolving system and organizational needs, particularly in fast-paced, dynamic environments.


